ABSTRACT



This thesis addresses the problem of automatically constructing the code
generation phase of a compiler from a specification of the source language and
target machine. A framework for such a specification is presented in which
information about language- and machine-dependent semantics is incorporated as a
set of transformations on an intermediate language program. The intermediate
language, which serves as the internal representation, and the metalanguage in
which the transformations are written are discussed in detail.

The major goal of this approach is to separate machine- and language-dependent
knowledge (as embodied in a transformation catalogue) from general knowledge
about code generation. This general knowledge is supplied by the third component
of the framework: a metainterpreter incorporating a fairly complete repertoire of
language- and machine-independent optimization algorithms for intermediate
language programs. The metainterpreter is also capable of selecting and applying
transformations from the transformation catalogue. The three-component framework
described in the thesis affords a specification that can easily be tailored to
new languages and machine architectures without compromising the ability to
generate optimal code.

§1.1 Introduction

The creation of a compiler for a specific language and target machine is an
arduous process. It is not uncommon to invest several years in the production of
an acceptable compiler; the excellent compilers available for PL/1 on MULTICS and
System/370 evolved over a decade or more. With the rapid development of new
computing hardware and the proliferation of high-level languages, such an
investment is no longer practical, especially if there is little carry-over from
one implementation to the next.

Compiler writers currently suffer from the same malady as the shoemaker's
children: they seem to be the last to benefit from the improvements in compiler
language technology that problem-oriented language processors have incorporated.
The current research has been directed towards providing the compiler writer with
the same high-level tools that he provides for others. In an effort to automate
compiler production, systems have been developed to automatically generate those
portions of the compiler which translate the source language program into an
internal form suitable for code generation. These systems have enhanced
portability and extensibility of the resultant compiler without a significant
degradation in its performance. The final phases of a compiler, those concerned
with code generation, are now coming under a similar scrutiny; several different
approaches are described in §1.4. This thesis addresses the problem of providing
a specification of a code generator. Such a specification is constructed by the
code generator designer within a framework provided by an intermediate language
(IL) and a metainterpreter. The intermediate language is used as the internal
representation of the code generator - the initial input (provided by the first
phase of the compiler) is a source language program expressed as an IL program;
the final output is the IL representation of the target machine program. The
metainterpreter has a detailed understanding of the semantics of IL programs and
is capable of [...]. The semantics of IL are limited to concepts common to many
languages and machines: flow of control and the [...] are the only primitive
concepts. Specification of [...] semantics (e.g., the semantics of individual
operators) is provided by the designer in the form of a transformation catalogue.
In essence, the semantics of IL serve as common ground on which the designer
(through the transformation catalogue) "explains" the source language and target
machine to the metainterpreter, which then performs the appropriate translation.
This "explanation" is in terms of a step-by-step syntactic manipulation of the IL
program; each transformation accumulates additional information for the
metainterpreter or provides translations for IL statements which are not yet
target machine instructions. Since the metainterpreter incorporates many of the
optimizations commonly performed by compilers, the specification need not supply
detailed implementation descriptions of these operations.

One can envision several distinct uses for such a specification:

• as a convenient way of replacing English descriptions of an algorithm (much
the [...]);

• as a program which, [...] interpreted to produce a [...] translation (e.g.,
syntax [...]);

• [...] to a [...] code generator [...] (similar to [...] fed to a
compiler-compiler).

Each successive use requires a more thorough understanding of the specification
but repays this investment with a corresponding increase in the degree of
automation achieved. The increase is based for the most part on a better
understanding of the interaction between components of the specification.
Automatic creation of a code generator from a specification would require
extensive analysis of these interactions, a capability only now emerging from
artificial intelligence research on program synthesis [Barstow]. Fortunately most
of the analytical mechanism required is in addition to the facilities provided by
the metainterpreter and intermediate language - it is reasonable to expect that
future research will be able to extend the framework described in the preceding
paragraph to allow automatic construction of a code generator. This thesis
concentrates on developing the framework to the point where it can be used
interpretively (as suggested by the second use). Implemented in a straightforward
fashion, the metainterpreter can perform the translation by alternately applying
transformations from the catalogue and optimizing the updated IL program. While
this approach is admittedly less efficient than current code generators, it
represents a significant step towards separating machine and language
dependencies in a declarative form (the transformation catalogue) from general
knowledge about code generation (embodied in the metainterpreter).

The following section provides a brief overview of the tasks confronting a code
generator. §1.3 presents a summary of the salient features of IL, the
transformation catalogue, and the metainterpreter. In §1.4, related work is
discussed with an eye towards providing a genealogy for the research reported
here. Finally, §1.5 outlines the organization of the remainder of the thesis.

§1.2 Setting the stage

[...] formalism, let us first characterize the [...] of a representation (in some
[...]) [...] in the original [...] to be [...]. The idea, of course, is that by
executing the sequence of machine instructions the target machine will carry out
the [...] computation. The remainder of this section outlines the tasks
confronting a code generator; our objective is to sketch the variety of knowledge
needed for making decisions during code generation and how current [...] this
knowledge.

An optimizing code generator is organized around three main tasks:
machine-independent optimization, translation to target machine instructions, and
machine-dependent optimization.



Machine-independent optimizations include global flow analysis, constant
propagation, common subexpression elimination, etc. These transformations modify
the semantic tree to produce a new tree which is strictly equivalent (i.e.,
equivalent regardless of the choice of target machine). Certain of these
transformations do make assumptions about the target machine architecture; for
instance, constant propagation assumes that it is more efficient to access a
constant than a variable. The more sophisticated code generators [Wulf] do not
actually modify the semantic tree - they maintain a list of alternatives for each
node in the tree†, [...] choice of transformation until the translation phase.

† They do not, however, list all alternatives as this would result in the
combinatorial growth of the semantic tree. Searching the full tree for the optimal
program accounts for the [...] of the code generation problem [Aho77].

The translation to target machine instructions takes place in several stages:

(i) Storage is allocated for variables [...] used in the source program. The
semantics of the [...]-specific allocation strategies (e.g., stacks).

(ii) Algorithms which implement the required computations (FOR-loops, [...])
[...].

(iii) The order in which computations are to be performed is determined. Through
the detection of redundant computations, it is often possible to [...] space in
the resulting code while maintaining the correctness of the computation.

(iv) Actual target machine instructions are generated. Machine-dependent
considerations (such as locations of operands for [...] operations, the lack of
[...], etc.) enter at this level.

From the many possible transformations applicable to a particular source program,
an optimizing code generator chooses some subset to produce the "best"
translation. These transformations are interdependent and an a priori
determination of their combined effect is difficult.



Machine-dependent (peephole) optimization [McKeeman, Wulf Chapter 6] of
instruction sequences can be used to improve the generated code - just how much
improvement can be made depends on the sophistication of the translation phase.
The goal is to substitute more efficient instruction sequences for small portions
of the code. Examples: elimination of jumps to other jumps and code following
unconditional jumps, use of short-address jumps (limited in how far they can
jump), elimination of redundant stores to a [...]. [...] no more improvements can
be made. Before the reader dismisses this final phase as "trivial," he should
consider this comment from [Wulf, pg. 124]:

     ... all the fancy optimization in the world is not nearly as important as
     careful and thorough exploitation of the [...]. [...] difficult to determine
     to what extent [this final phase] would be needed if more complete
     algorithms, rather than heuristics, existed in [...] some of the operations
     of [this final phase] exist simply because the requisite [...] always [...]
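
Two of the peephole rewrites named above (jumps to other jumps, and code
following unconditional jumps) can be sketched in a few lines. This is a minimal
illustration only, not the mechanism of any cited compiler: the instruction
encoding (tuples tagged "label", "jmp", or "op") and the function names are
assumptions made for the example.

```python
def final_target(label, first_after):
    """Follow a chain of jumps from `label`; stop on a cycle or a non-jump."""
    seen = set()
    while label not in seen:
        seen.add(label)
        nxt = first_after.get(label)
        if nxt is None or nxt[0] != "jmp":
            break
        label = nxt[1]
    return label


def peephole(code):
    """Repeat the two rewrites until a pass makes no further change."""
    changed = True
    while changed:
        changed = False
        # Map each label to the first non-label instruction that follows it.
        first_after = {}
        for i, ins in enumerate(code):
            if ins[0] == "label":
                j = i + 1
                while j < len(code) and code[j][0] == "label":
                    j += 1
                first_after[ins[1]] = code[j] if j < len(code) else None
        out, unreachable = [], False
        for ins in code:
            if ins[0] == "label":
                unreachable = False          # a label may be a jump target
            elif unreachable:
                changed = True               # drop code after an unconditional jump
                continue
            elif ins[0] == "jmp":
                target = final_target(ins[1], first_after)
                if target != ins[1]:
                    ins, changed = ("jmp", target), True   # jump to a jump
                unreachable = True
            out.append(ins)
        code = out
    return code


example = [("jmp", "A"), ("op", "dead"), ("label", "A"),
           ("jmp", "B"), ("label", "B"), ("op", "x")]
improved = peephole(example)
```

The outer loop mirrors the idea of applying substitutions until no more
improvements can be made; in this small sketch a second pass simply confirms that
nothing further applies.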



It should be noted that relatively few of the transformations mentioned above are
uniformly applicable. Unfortunately, the conventional control structures upon
which extant code generators are based preclude a trial-and-error approach to
optimization. The programmer, using his knowledge of the target machine
architecture, must, out of necessity, incorporate in the code generator either
some subset of the applicable transformations or heuristics to select the "best"
transformation at specific points in the code generation process. These
heuristics base their decisions on a local examination of the tree; more
far-reaching consequences are difficult to determine - thus, most heuristics
"work" for only a subset (albeit large) of the possible programs. Although the
compromises inherent in heuristics serve primarily to reduce the amount of
computation needed to complete the translation, they also embody knowledge
helpful in the generation of code. Some of these transformations are of general
use in that they are independent of both the intermediate representation and the
target machine; those transformations form a nucleus of knowledge for the
portable code generation system.

§1.3 Introduction to IL/ML

The framework for the specification of code generators provided by the IL/ML
system has three components:

• an intermediate language (IL) which serves as the internal representation for
all stages of the translation. At any given moment, the IL program embodies all
the text, symbol table, and state information accumulated by the code generator
up to that point in the translation.

• a transformation catalogue whose component transformations are expressed in a
context-sensitive pattern-matching metalanguage (ML) as pattern/replacement
pairs. The pattern specifies the context of the transformation as an IL program
fragment; the replacement is [...].

• a metainterpreter incorporating a fairly complete repertoire of machine- and
language-independent optimization algorithms for IL programs. The metainterpreter
is also capable of selecting and applying transformations from the transformation
catalogue.

Within this framework, code generation may be viewed as follows†: the
transformation catalogue is searched by the metainterpreter until a pattern is
found that matches some fragment of the current IL program, then the
corresponding replacement is substituted for the matched fragment creating an
updated version of the IL program. Next the metainterpreter optimizes the new
version of the program utilizing new information and opportunities presented by
the transformation. This cycle is repeated until no further matches can be found,
at which point the translation is completed. The simplicity of the mechanism,
along with the modularity of the transformation data base, make this an
attractive basis for a code generator specification.
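
The cycle described above can be sketched as a small rewrite loop. This is a
conceptual illustration only: the tuple encoding of IL statements, the catalogue
of (pattern, replacement) function pairs, the single register r0, and the
redundant-load optimizer are all assumptions invented for the example, not the IL
or ML notation of the thesis.

```python
def translate(program, catalogue, optimize):
    """Alternate transformation and optimization until no pattern matches."""
    while True:
        for i, stmt in enumerate(program):
            # Search the catalogue for a pattern matching this statement.
            rule = next((r for r in catalogue if r[0](stmt)), None)
            if rule is not None:
                # Substitute the replacement, then let the optimizer run.
                program = program[:i] + rule[1](stmt) + program[i + 1:]
                program = optimize(program)
                break
        else:
            return program          # no statement matched: translation done


# Hypothetical catalogue: (pattern predicate, replacement builder) pairs.
CATALOGUE = [
    (lambda s: s[0] == "move",
     lambda s: [("LOAD", "r0", s[2]), ("STORE", "r0", s[1])]),
    (lambda s: s[0] == "add",
     lambda s: [("LOAD", "r0", s[2]), ("ADD", "r0", s[3]),
                ("STORE", "r0", s[1])]),
]


def drop_redundant_loads(program):
    """Remove a LOAD that re-reads what the previous STORE just wrote."""
    out = []
    for stmt in program:
        prev = out[-1] if out else None
        if (stmt[0] == "LOAD" and prev is not None
                and prev[0] == "STORE" and prev[1:] == stmt[1:]):
            continue                # the register already holds this value
        out.append(stmt)
    return out


source = [("add", "t", "x", "y"), ("move", "z", "t")]
result = translate(source, CATALOGUE, drop_redundant_loads)
```

Because the upper-case target statements match no pattern in this toy catalogue,
the loop terminates once every lower-case statement has been expanded; a real
catalogue would have to guarantee this property itself.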

Only concepts common to most machine and source language programs have been
incorporated into IL and the metainterpreter - concepts specific to a machine or
language are introduced by the designer through the transformation catalogue.
Many of these new concepts need never be related to the primitives of IL; they
can be introduced into the IL program as attributes of some component of the IL
program where they can be referenced by transformations. The semantics of these
attributes are established by the role they play in various transformations; for
example, the [...] operation addition [...] may be related to the integer and
floating point addition instructions of the target machine - neither IL nor the
metainterpreter need know or support addition as a primitive operation. The
ability to express source language semantics in terms of other, simpler
operations and, ultimately, in terms of target machine instructions, without
recourse to some fixed semantics, allows great flexibility without any additional
complexity in the intermediate language or metainterpreter.

† This description is only a conceptual model: in a code generator constructed
from a [...] specification [...] searching and applying transformations would
have been ordered by the metacompiler and incorporated in the organization of the
code generator ([...] done). Some decisions would be made during the construction
of the code generator, others would be [...]. Other distinctions between
interpretation and compilation of the specification are ignored until Chapter 5.

But isn't it "cheating" to require the designer to spell out source language
semantics in terms of target machine instruction sequences? Doesn't that raise
the objection to conventional code generators, viz. that a large investment is
necessary to redo the translations when another target machine or source language
is to be accommodated? No, not really. There is no "magic" provided by the IL/ML
system - the semantics of the source language and target machine must always be
described by the designer in any truly language- and machine-independent system.
However, their most natural (and useful) description is in terms of one another -
after all, the designer, in theory, fully understands both and the simplicity of
the IL/ML system eliminates the need for expertise in any other
language/interpreter. Moreover, since the metainterpreter incorporates the
necessary knowledge about general optimization techniques, the overhead of the
description is small compared to coding a conventional code generator. It is true
that a more highly specified intermediate language semantics might be more
appropriate for a specific source language and target machine [...] impede the
transition to other languages and target machines (see description of abstract
machines in §1.4). [...] code generation system, such constraints have been
avoided.



§1.3.1 A syntactic model of code generation

One of the most useful discoveries of artificial intelligence research is that
complicated semantic manipulations can be accomplished with step-by-step
syntactic manipulation of an appropriately chosen data base (see, for example,
[Hewitt]). This section explores the application of this approach to the process
of code generation. The objective of this exploration is to provide a different
perspective of the IL/ML system - hopefully this will lead to a better designed
transformation catalogue.

One can characterize code generation as a consecutive sequence of transformations
chosen from the transformation catalogue and applied to an intermediate language
input string:

\[ s_{\text{intermediate}} \rightarrow s_1 \rightarrow s_2 \rightarrow \cdots \rightarrow s_n \rightarrow s_{\text{target machine}} \]

$s_{\text{target machine}}$ is not necessarily unique; thus, the code generation
algorithm may have to choose among many translations. If the translation uses an
abstract machine then we will have

\[ s_{\text{intermediate}} \rightarrow s_1 \rightarrow \cdots \rightarrow s_{k-1} \rightarrow s_{AM} \rightarrow s_{k+1} \rightarrow \cdots \rightarrow s_n \rightarrow s_{\text{target machine}} \]

The transformations leading to $s_{AM}$ are independent of the target machine;
the transformations following $s_{AM}$ are machine dependent. If we group
transformations according to the code generation steps they describe (e.g.,
storage allocation, register assignment, etc.), each group describes the
translation of programs for a particular abstract machine into programs for
another. By defining a hierarchy of abstract machines, the designer can limit the
impact of a particular feature of the target machine to a few transformations.
This type of organization of the transformation catalogue leads to a highly
modular specification.

As was mentioned above, the resulting machine language program is not always
unique - in order to be able to decide among competing translations, it is
necessary to introduce some measure (m) of a program's cost:

\[ m : s \rightarrow R \cup \{\infty\} \]

This totally ordered measure is to reflect the optimality of the translation; the
smaller the measure, the more optimal the translation. Note that the measure is
not defined ($m(s^*) = \infty$) for intermediate language strings ($s^*$) that do
not represent a completed translation. Typically this measure is computed from
the values of attributes of the statements in the final program: it is up to the
designer to ensure that each statement is assigned these attributes - if some
statement does not have the appropriate attributes defined, the measure for that
IL program will be undefined. The final choice for a given input string s and
measure m is the set of "optimal" translations given by

\[ \{\, s' \mid s \Rightarrow s' \text{ and for all } s'' \; [\, s \Rightarrow s'' \text{ implies } m(s') \le m(s'') \,] \,\} \]

Note that we restrict our notion of optimality to those strings which can be
actually derived from the initial program (s) by repeated applications of
transformations from the transformation catalogue (i.e., $s \Rightarrow s', s''$).
It is possible that semantically equivalent strings exist which are more optimal
but which may not be discovered because of some inadequacy in the transformation
catalogue. In some sense this inadequacy is intrinsic since the semantic
equivalence problem is in general unsolvable [Aho70].
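
As a minimal sketch of this selection step, assume each statement is a dictionary
that may carry a numeric "cost" attribute, and that an undefined measure is
modelled as infinity so that incomplete translations never win the comparison.
Both the encoding and the function names are assumptions of this example.

```python
import math


def measure(program):
    """Sum the 'cost' attribute of every statement; infinite if any is missing."""
    total = 0
    for stmt in program:
        if "cost" not in stmt:
            return math.inf      # incomplete translation: measure undefined
        total += stmt["cost"]
    return total


def optimal_translations(candidates, m=measure):
    """Return every derived candidate whose measure is minimal and finite."""
    best = min((m(s) for s in candidates), default=math.inf)
    if math.isinf(best):
        return []                # no completed translation was derived
    return [s for s in candidates if m(s) == best]


a = [{"op": "LOAD", "cost": 1}, {"op": "ADD", "cost": 2}]
b = [{"op": "INC", "cost": 3}]
c = [{"op": "LOAD", "cost": 1}, {"op": "add"}]   # 'add' not yet translated
d = [{"op": "CALL", "cost": 5}]
chosen = optimal_translations([a, b, c, d])
```

Here `a` and `b` tie with the smallest measure, so both are returned; the result
is a set of translations rather than a single one, and `c` is excluded because
one of its statements lacks the attribute needed to compute the measure.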

In our syntactic view of code generation, we have set forth two tasks for the
code generator. First, it must produce a set of translations for the given input
string that meet certain basic criteria: e.g., they must be well-formed
machine-language programs (only these should have the correct attributes needed
to compute the measure). Second, it must select one of these translations as the
translation. This selection is based on the optimality of the translation as well
as other constraints the user may supply at compile time (e.g., upper bounds on
space and/or execution speed). This filtering process is an expensive one as it
means discarding completed translations - the more restrictive the compile-time
constraints, the more programs may have to be discarded before a satisfactory
translation is encountered. An alternative approach is to include these criteria
as part of transformations in the catalogue, using contextual information to
disqualify transformations which [...] transitions are aborted before the effort
is expended to complete them. The decision to include essentially all constraints
as transformations allows a parsimonious description of acceptable translations
at the cost of [...] transformations [...] automatic creation of a code generator
from a set of transformations may prompt us to change our minds.

Let us take a moment to outline the advantages and disadvantages of this approach
to code generation. By modeling code generation as a series of simple syntactic
transformations, we have removed the onus of specifying the order in which the
transformations are to be done - we have removed the control structure of the
code generator. In its place we require that the designer specify enough context
for each transformation to guarantee it will be used only when appropriate. The
merits of this tradeoff are difficult to judge. For small sets of transformations
it is simpler to omit the control structure as it is possible to foresee
undesirable interactions between the various transformations [...]. As the number
of transformations increases, it becomes increasingly difficult to account for
the global effect of an additional transformation. Adopting a modular
organization for the transformations alleviates this problem - the use of a
hierarchy of transformations (with little overlap between levels) supplies an
implicit context for the transformations on a given level. There are [...]
mechanisms for enforcing this modularity; several are presented in later
examples. The greatest advantage of a syntactic view of code generation is that
the designer is not encumbered with the details of programming but is able to
deal at a higher, more natural level in describing code generation. The principal
disadvantage is the current lack of a simple technique for creating a code
generator from the transformations. To actually construct a code generator, we
will have to make explicit the implicit control structure supplied by the context
of each transformation. Until this problem is solved, it looms as the largest
barrier to accepting the syntactic view of code generation.



§1.3.2 The transformation catalogue and metainterpreter

Since the emphasis in a specification is on describing what the code generator is
to do rather than how it is to be done, an effort has been made to distinguish
strategy from mechanism. The strategic decisions made by a code generator are
embodied in the transformation catalogue and fall roughly into three categories:

(1) expansion of a high-level IL statement into a series of more [...];

(2) simplification or elimination of IL statements whose operations can be
performed at compile time;

(3) transformations on sequences of IL statements, e.g., code motion in loops,
permutation of evaluation order to achieve better register usage, peephole
optimizations, etc.

The applicability of a transformation to a particular IL statement depends on the
context in which that statement appears. In traditional code generators the
context of an operation is established by two interdependent computations:

• flow analysis to determine available expressions, use-definition chaining, and
live variables;

• compile-time computation of values for variables and intermediate results.

In an IL/ML specification, these computations have been incorporated as part of
the context matching performed before a transformation is applied - the designer
never explicitly invokes the underlying mechanism; instead he may deal directly
with values of variables, execution order of IL statements, etc. as part of an ML
pattern.

The adequacy of IL/ML as the basis for a code generator specification hinges on
the ability of the pattern matching mechanism to express the desired context. The
pattern primitives provided by ML are based on standard data flow analysis
techniques [Ullman, Kildall] and do not require extensions to the state of the
art. Fortunately, these standard techniques easily compute the information
required by many common optimizations. Combined with rudimentary symbolic
computation abilities (arithmetic on integers, canonicalization of expressions,
etc.), [...] mechanism. Ideally, it would be nice to [...] on sequences of [...]
set of transformations [...] even the most [...]:

(1) to express the kernel of the algorithm as a simple transformation (such as
assigning a compiler temporary a [...] register name) and [...] a clever
metacompiler might be able to recognize these transformations [...] resulting
code generator.

(2) to include built-in predicates (in the case of induction variables) or a
[...] transformation to perform the desired translation. To [...] requires
algorithms that always "work" (i.e., produce complete or optimal results); for
many of the transformations in question no such algorithm currently exists.

Neither alternative is completely satisfactory and further research is needed to
reach a conclusion. It seems reasonable to expect an eventual resolution of this
issue and there is some evidence [Harrison] that many such optimizations may be
ignored without significantly degrading the usability of the specification. In
this spirit, the remainder of the thesis concentrates on the specification of
code generation techniques which have a basis in flow analysis and its
extensions.



§1.4 Relation to previous work

Until recently, research had focused on two approaches for the specification of
code generators: the development of high-level languages better adapted to the
writing of code generators and the introduction of an "abstract machine" to
further simplify the code generation process. These high-level languages [Young]
provide as primitives many of the elementary operations used in code generation
such as storage and register allocation and automatic management of internal data
bases (e.g., the symbol table). The actual process of code generation typically
lies in a user-provided code template with sundry parameters such as the actual
[...] of the operands, etc. Local optimization is accomplished by special
constructs within the template which allow testing for given attributes of the
parameters. Modularity of the code generator is improved and much of the
machine-dependent information is in descriptive form. Of course, the portions of
the code generation algorithm and the optimization mechanism which depend on the
semantics of the source language or target machine must still be coded into
procedural form. The encoding of this information (usually as special cases)
represents a large portion of many optimizing compilers [Carter].



The apparent dichotomy between descriptions of the intermediate language and the
target machine led to disparate mechanisms for describing each. The use of an
abstract machine (AM) capitalizes on this dichotomy. The operations of the AM are
a set of low-level instructions based on some simple architecture. A code
generator based on an AM [Poole] performs two translations: first the parse tree
is translated into a sequence of AM operations and then each AM operation is, in
turn, expanded into a sequence of target instructions. The optimality of the
resultant code is largely a function of how closely the AM and the target machine
correspond and how much work is expended on the expansion.
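
The two translations performed by an AM-based code generator can be illustrated
with a toy stack machine. The AM operations (PUSH, ADD, MUL), the target
instruction strings, and the expansion table are hypothetical choices made for
this sketch; they do not describe any of the cited systems.

```python
def to_am(tree):
    """First translation: parse tree -> sequence of AM operations (post-order)."""
    if isinstance(tree, str):            # a leaf names a variable
        return [("PUSH", tree)]
    op, left, right = tree
    return to_am(left) + to_am(right) + [(op,)]


def _binary(mnemonic):
    """Build the expansion of a two-operand AM operation."""
    return lambda: ["POP r1", "POP r0", f"{mnemonic} r0, r1", "PUSH r0"]


# Second translation: each AM operation expands into target instructions.
EXPANSION = {
    "PUSH": lambda var: [f"LOAD r0, {var}", "PUSH r0"],
    "ADD": _binary("ADD"),
    "MUL": _binary("MUL"),
}


def expand(am_ops):
    target = []
    for name, *args in am_ops:
        target.extend(EXPANSION[name](*args))
    return target


am_program = to_am(("ADD", "a", ("MUL", "b", "c")))
target_program = expand(am_program)
```

The naive expansion pushes values and immediately pops them again, which
illustrates why the quality of the resulting code depends on how closely the AM
matches the target machine and on how much work is spent on the expansion.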

The first AM was UNCOL (universal computer oriented language) [Steel], introduced
to solve the "m×n translator" problem. Its proponents hoped that the use of a
common base language would reduce the number of modules needed to translate m
languages to n machines from m×n to m+n; they would translate a program in one of
the m languages to UNCOL and then translate the UNCOL program to one of the n
machines. The "UN" in "UNCOL" was their undoing as it proved exceedingly
difficult to incorporate all the features of existing languages and machines into
the primitives of a single language. By limiting the scope of the AM to a class
of languages and machines [Coleman, Waite], it was possible to achieve truly
portable software with a minimum of effort. Current implementations fall into two
categories:

(a) The expansion is guided by a description of the target machine [...]
    accommodate a different machine; however, due to the loss of
    information during the translation to [...], it is difficult to
    use special features of the target hardware to advantage. The
    description language is generally tailored to a specific class of
    machines and cannot easily be augmented.

(b) The expansion is done by a program designed to produce highly
    optimized code for a specific target machine [...] and is
    achieved via a "simulation" of the AM operations to gather sufficient
    information about the [...]



Thus, the designer had to choose between achieving a limited machine
independence at the cost of poor optimization or producing optimized code and
investing a substantial effort for each new target machine.

In an effort to accommodate a wider class of machines than encompassed
by a single AM, some researchers [...] have used a more general
machine-description facility such as that provided by ISP [...]. An ISP description
provides a low-level (i.e., register transfer), highly detailed description of the
target machine which is amenable to mechanical interpretation to simulate the
described processor. If an ISP description of source language operations is also
available, a sophisticated code generator would have sufficient information to
complete a translation. Despite the success of ISP in describing processors
[Barbacci], it is not really suitable for describing the semantics of a high-level
language: the level of detail required by [...] descriptions for
many of the operators and data types of the language. In addition, reducing the
semantics of the target machine and source language to their lowest common
denominator results in the loss of [...] information used by many [...].

The introduction of attribute grammars [Knuth, Lewis] has sparked recent
research into the formal [...] optimization. [...] is augmented by the addition of
[...] attributes [...].
The relationship between the attributes of one symbol and another is specified by
"semantic rules" associated with each production, defining the synthesized
attributes for the nonterminal symbol on the left hand side of the production and
the inherited attributes for the nonterminal symbols on the right hand side of the
production. [Neel] presents several production systems augmented with attributes
that describe information commonly associated with sources of optimization (block
numbers, whether a statement can be reached during execution, etc.). The
principal advantage of such production systems is that there are no dependencies
in the formalism on specific languages or machines [...] attribute grammars
provide a general mechanism for accumulating contextual information during the first
phase of compilation. However, optimizations that require other than a local
examination of context are hard to accommodate: constructing the appropriate
attributes can be convoluted (cf. lambda [...] in [Knuth]). Finally,
except in trivial cases, translation into a target machine program (with the
attendant optimizations) still requires a [...] which is highly machine
dependent.

Attributes have been adopted by Newcomer in his work on generalizing the
optimization strategies employed by the BLISS [...]
performing the expansion into PDP-11 code, this compiler depends heavily on tables
which contain hand-compiled information on the best choice for each expansion.
Newcomer attempts to automate the production of these tables by examining a
description of the target machine. He uses a [...] search technique based on
a difference operator to exhaustively search possible instruction sequences - from
this search (guided by a preferred attribute set [...] the machine
description) he collects the information needed to construct the tables. The
machine description is a set of [...] transformations where the
appropriate context is established through the use of attributes. Although the
results [...] constructing a
[...] of a machine description [...].

[...] attempt at constructing a modular code
[...] optimization repertoire is the
GPO compiler developed at IBM [Harrison]. The
structure of the GPO compiler is similar to that proposed by this thesis: there is an
intermediate language which is used as the internal representation, a set of defining
procedures that serve as the basis for translating/expanding programs into
pseudo-machine language, and an analyzer which [...] the internal representation
as optimizations and expansions are applied. The expansions and optimizations are
iterated until the translation is complete; a final phase translates the resultant
program into machine language, performing register allocation, etc. The GPO
compiler is oriented towards PL/I-like programs - the primitives provided in the
intermediate language directly support block structure, PL/I pointer semantics, etc.
The set of defining procedures allows tailoring of code dependent on attributes of
the operands. The main differences between the GPO compiler and IL/ML are



• the lack of sophisticated name management (e.g., overlaying, aliasing)
  on the part of the GPO [...]

• the syntax of defining procedures of the GPO compiler is best
  suited for PL/I-like [...]

• there is no notion of [...] adjacent statements into a single
  operation (as in peephole optimization). Although Harrison talks of
  [...], [...] take place on a
  statement-by-statement basis (i.e., there is no general pattern
  matching facility).

• in the GPO compiler, attributes are treated like any other variable -
  optimizations such as constant propagation are relied upon to make
  the attribute information available throughout the program. IL/ML
  provides a separate semantics for attributes thereby eliminating
  certain situations where the optimizations would not be able to
  unravel a complicated sequence of statements.

The complexity of the GPO compiler is greatly reduced from that of current PL/I
optimizing compilers. [Carter] has hand-simulated the expansion of test cases
using a set of simple defining procedures for the substring operator of PL/I,
producing code which equals or betters that of the IBM optimizing compiler (which
includes some 8000 statements to treat special cases of substring). The inclusion
of more sophisticated optimizations in the processor (cf. [Schatz]) should further
improve these statistics. Encouragingly, many of these results seem applicable to
the formalism proposed in this thesis - the increased generality of IL/ML should not
reduce its performance in this area.



§1.5 Outline of remaining chapters

Chapter 2 is a detailed description of the intermediate language IL: the
syntax of IL is defined and the representation of data is discussed. The semantics
of each IL construct is described and related to the needs of ML and the
metainterpreter. The chapter concludes with a brief introduction to the compile-
time calculation of values.

Chapter 3 discusses the construction of a transformation from ML templates
that specify its context and effect. The syntax of a template (description of an IL
program fragment) is described, emphasizing the utility of wild cards and built-in
functions. Rules for applying the transformation and updating the IL program are
given. The final section describes a few sample transformations.

Chapter 4 presents a set of sample transformations and simulates their
application by the metainterpreter to a sample IL program. This detailed example is
aimed at demonstrating the ease of constructing a transformation catalogue and the
feasibility of performing code generation using the IL/ML system.

The final chapter briefly discusses the metainterpreter and the facilities it
should provide, then summarizes the results of this work and suggests directions for
further research.



CHAPTER TWO

§2.1 The intermediate language: IL

The intermediate language described in this chapter serves as a foundation
for a specification constructed as outlined in §1.1. IL supports a skeletal semantics
common to all programs from source to machine language; this includes primitives to
describe the flow of control and the managing of names and values within an IL
program. In addition, IL includes a mechanism for accumulating information on
particular operations and storage cells for later use by the transformation
catalogue and the metainterpreter. The remainder of the semantics of an IL
program (e.g., the meaning of operations) resides in the transformation catalogue and
is made available when these transformations are applied by the metainterpreter.
By relegating the language and machine dependence to the transformation
catalogue and providing a general syntactic mechanism for accumulating information,
IL becomes a suitable intermediate language for the entire translation process. In
order to allow common code generation operations (flow analysis, compile-time
calculation of values) to be subsumed by the metainterpreter, separate fields are
provided in each IL statement for the information required by the metainterpreter in
performing its analysis.

Although IL in its most general form has a rather skeletal semantics and is a
suitable intermediate language for a wide variety of source languages, certain
conventions are established below for use in examples in later sections. Most of
these conventions were inspired by conventional sequential, algebraic languages
such as ALGOL, BLISS, or even CLU that are amenable to efficient interpretation by
conventional machine architectures (i.e., those traditionally thought of as compiled
languages). These conventions will be inappropriate in part for compiled languages
that are not related to ALGOL (e.g., LISP); in many cases these can be easily
accommodated by relatively simple changes. No direct attention has been paid to
the special problems associated with the translation of languages whose
control structure differs substantially from that of ALGOL (e.g., SNOBOL, DYNAMO,
SIMULA, etc.); this omission reflects the bias of this research towards the
specification of conventional code generators. Hopefully, further work will fill this
gap.

The most common form of intermediate representation is a flow graph of
basic blocks where each basic block is described by a directed acyclic graph or
dag (see, for example, Chapter 12 of [Aho77]). IL is a linearization of this
graphic representation with several additional restrictions to allow easy modeling of
conventional languages. An IL statement may specify one of two actions: the
conditional transfer of control to another statement (these correspond to the arcs
of the flow graph); or the application of an operator to its operands (these
correspond to the interior nodes of a dag), optionally saving the result in a named
cell. Similarly, an IL statement may have one of two effects: transfer of control or
the change in the value of one or more cells. As we will see below, it is easy to
determine the exact effect of a statement from its syntactic form; targets of
transfers of control and the set of cells changed by a statement (its kill set) are
syntactically distinguishable from other portions of an IL statement.

As mentioned above, IL provides a schematic representation which is flexible
enough to be used for programs varying in level from source to machine language.
To encompass such a variety of programs, IL could not (and does not) have much in
the way of built-in semantics. The following list summarizes the primitive concepts
of IL:



• conditional transfer of control to another IL statement. In the
  absence of a transfer of control, execution proceeds sequentially
  through the IL program.

• application of an operator to its operands. There are no built-in
  operations supported by IL - the designer must ensure that each
  operator can be interpreted by the target machine or further
  expanded in the transformation catalogue.

• value storage provided by named cells. The scope of a cell name and
  the extent of its storage cover the entire IL program. Note that
  there is no distinction between program variables and compiler
  temporaries - all requirements for value storage must be met by
  using cells. Cell references have an lvalue/rvalue semantics similar
  to BCPL or BLISS: the name of a cell serves as its lvalue; applying
  the contents operator to the lvalue of a cell yields the
  rvalue of that cell. Aggregate data such as arrays or structures may
  be modeled by structuring the lvalue and rvalue of a cell.

• attributes for both lvalues and rvalues - a syntactic mechanism
  for accumulating "declared" information that is unaffected by
  subsequent IL operations. Statement attributes provide the
  same capability for each statement in an IL program.

• literals fill the dual role of reserved words (operators, attribute
  names, etc.) and constant rvalues (numbers, character strings, etc.).
  The meaning of a literal is "self-contained": one need go no further
  than the statement in which it appears to establish its meaning. Note
  that there is no such thing as a literal lvalue, i.e., an lvalue whose
  meaning can be established independent of the context in which it
  appears - thus it is not legal to apply the contents operator to a
  literal.

The following sections describe each of these areas in more detail, discussing how
popular concepts such as block structure, data types, etc. are handled in IL.

§2.2 Data in IL

All data storage in IL is provided by named cells - program variables,
intermediate results, etc. are represented in an IL program by a cell. Each cell has
three components:

(1) an lvalue (name) which unambiguously identifies the cell. The scope
    of the lvalue covers the entire IL program. An lvalue can be
    structured for modeling arrays, structures, etc.

(2) an rvalue (written <lvalue>) which is modified whenever the cell
    named lvalue is used to hold the result of an operation. Any quantity
    associated with the cell that can be modified by an IL operation is
    considered to be part of the rvalue; if more than one such quantity
    exists, both the rvalue and the lvalue must be structured.

(3) a set of attributes associated with either the lvalue or rvalue.
    Attributes are used for declarative information that, once established,
    is unaffected by subsequent IL operations - attributes are sort of a
    [...]


Note that no automatic translation is provided by the metainterpreter for cells; the
designer is responsible for realizing each cell utilized in the IL program (by
incorporating appropriate transformations in the transformation catalogue). This
may include allocating main storage (for program variables), assigning registers (for
short-lived intermediate results), or subsuming them completely (for intermediate
results computed at compile time or internally by the target machine - e.g., indexed
addressing).

Although an lvalue unambiguously identifies a cell, it is not necessarily
unique. A given cell may come to have more than one name through redundant
expression elimination or the ALIAS pseudo-operation (see §2.3.3). From then on
either name may be used interchangeably. The ALIAS pseudo-operation may also be
used to implement the overlaying of storage, an operation provided in many source
languages by allowing the equivalencing of names (e.g., FORTRAN); however, each
alias must be made explicitly - this is explored further in §2.2.2. Note that an
lvalue may be used as an operand and that, as an operand, it will require
declaration of attributes similar to those for an rvalue (type, length, value, etc.) -
care must be taken so as not to confuse lvalue attributes with rvalue attributes
and vice versa.

There is no separate provision for the scoping of lvalues (block structure).
Through a declaration of a variable of the same name in an inner block, scoping
allows shielding of a cell from use inside that block. In practice, however,
procedure calls and pointers allow access to cells which are not directly
accessible as operands. Thus the original cell cannot be "forgotten" completely
while processing the inner block: a mechanism must be provided for referencing
both the new cell and shielded cell when describing the effect of computations
within the inner block. The other information provided by scope rules - lifetime
information - is more accurately determined by live variable analysis performed by
the metainterpreter. The additional cells provided for by scope rules can be
created by choosing different cell names for each new declaration of the variable
(perhaps by suffixing the linear block number to the variable name).
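
The renaming suggested above is mechanical. The sketch below assumes declarations arrive as (block number, variable name) pairs and that an underscore separates the name from the suffix; both choices are illustrative and not part of IL.

```python
def rename_declarations(decls):
    """Give every declaration its own cell name.

    decls is a list of (block_number, variable_name) pairs; the result maps
    each pair to a cell name built by suffixing the block number.
    """
    return {(block, name): f"{name}_{block}" for block, name in decls}

# X is declared in block 1 and redeclared in the inner block 2.
cells = rename_declarations([(1, "X"), (2, "X"), (2, "Y")])
```

Both X cells remain nameable while the inner block is processed, so the effect of a procedure call or pointer store on the shielded cell can still be described.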

[...] simply objects. If the source language has declared "types", these may be
incorporated as attributes of the rvalue ([...]
information is another component of the rvalue). Transformations can utilize these
attributes to tailor the generated code (see Figure [...]); [...]
for other properties of rvalues: their size, precision, etc. In theory, data types
provide additional information in strongly typed languages. For example, assignment
through an integer pointer should affect only cells whose rvalues have type integer.
In practice, aliasing (see above), lack of type checking in computing pointer values,
and (legal) inconsistencies between actual and formal procedure parameters
conspire to prevent the designer from taking advantage of this additional
information. In other words, just because the pointer has been declared an integer
pointer does not guarantee that it points only to cells of type integer. It is worth
noting here that the metainterpreter does know about certain classes of objects,
such as numbers, allowing transformations to manipulate certain rvalues at compile
time.

§2.2.1 Attributes

Attributes provide a general mechanism for associating information with
components (cells and statements) of an IL program. Attributes associated with
the lvalue or rvalue of a cell provide information which is unaffected by IL
operations, e.g., its type, storage class, size, etc. This information is initially
provided by the first phase of the compiler or added during translation by
transformations as it is discovered. Once established, cell attributes are
available from any point in the IL program - dynamic information that is context
dependent (e.g., which register contains the current value of the cell) cannot be
stored as an attribute†. Attributes are the work horse of a specification: they
provide a symbol table facility for each declared variable and intermediate result,
model synthesized and inherited attributes used for passing contextual information
about the operation tree, and so on ad infinitum.

Statement attributes allow information not relevant to the result cell to be
associated with each statement. This includes properties of the operator (e.g.,
commutativity, size of a target machine instruction), effects on the global state of
the interpreter (e.g., which condition codes are changed by a target machine
operator), progress made in translating the statement (useful for communication
between a set of transformations), etc. By incorporating these pieces of
information as attributes, transformations can tailor the IL program taking into
consideration machine- and language-dependent features without building machine
and language dependencies into the metainterpreter.



† Dynamic information may be stored as part of the rvalue of a cell; in many cases
compile-time computation of rvalues will propagate this information as effectively as
if it were an attribute. Moreover, much of this type of information is used for
optimizations which are already incorporated in the [...]

Attributes are referenced in an IL program as follows:

    "attribute_name" for statement attributes;
    "lvalue:attribute_name" for lvalue attributes;
    "<lvalue>:attribute_name" for rvalue attributes.

Each attribute has a value (always a literal) established by some IL statement by
including an assignment to the attribute name in the attribute field of that
statement. For example, the following IL program statement illustrates the
attributes which might be associated with the declaration of a real variable Z in
a PASCAL program:

Label    Operator       Operands    Attributes

Z        declaration                [...]



The first line indicates that the address (lvalue) of Z is a two byte unsigned
integer - this information will be needed [...] by some
transformation if Z enters into a pointer calculation. The second line gives the
lexical level and stack frame offset assigned to Z† (either by the first phase of the
compiler or a transformation applied earlier); a transformation could be included in
the transformation catalogue to compute the [...] of Z from this
information. Finally, the third line indicates that the value of Z occupies 8 bytes
and has type real. Note that the "declaration" operator has no special significance
in IL; any semantics associated with this operator (e.g., allocation of storage or the
initialization of Z's rvalue) will be captured in the transformation catalogue. The
same is true for each of the attributes described in this paragraph: in IL, their
values are simply literals - the interpretation ascribed to them in the explanation
reflects the role they play in transformations used by the metainterpreter.

† These were arbitrarily chosen to be lvalue attributes; in general, attributes of a cell
may be associated with the lvalue or rvalue - a convention is chosen here
[...]

§2.2.2 Structuring of cell names and values

The ability to structure lvalues (and their corresponding rvalues) simplifies
the modeling of aggregate data and operations which affect one or more
components. Each component is, in effect, a separate lvalue; its type, size, and
other attributes can be maintained separately from those of other components. It
is also possible to perform operations on the aggregate data as a whole, changing
all components in one operation. A component's lvalue is constructed by appending
the appropriate selector to the lvalue of the aggregate, like so:
aggregate_name.selector. For example, if A were an array dimensioned from 1 to
10 then

    <A>        the entire array
    <A>.2      A(2), the second component of A
    <A>.<I>    A(I), the Ith component of A
    <A>.*      all components of A (A.1 through A.10)

Note that <aggregate_name.selector> is equivalent to <aggregate_name>.selector -
either form may be used interchangeably. In the last line, "*" was introduced as a
convenient abbreviation for "all possible component names." Of course, "*" is
never actually expanded but rather serves as a wild card when resolving attribute
references to components of an aggregate cell. For example, <A>.* would be used
when referring collectively to elements of the array, as when declaring the type of
the elements (assuming A is homogeneous). Thus, if a program contained the
definition <A>.*:type=boolean then the attribute reference <A>.3:type could be
resolved to "boolean." When used in an attribute reference, <A>.* is not
equivalent to <A>: attributes for an aggregate are maintained separately from
those of its components. The following IL statement illustrates the attributes which
might be associated with a declaration of the above array:

Label    Operator       Operands    Attributes

A        declaration                [...]
                                    <A>.*:type=boolean <A>.*:size=1



Note that the example specifies that the rvalue of A is an array 10 bytes long and
that the lvalue of A is a 2-byte unsigned integer (just like any other address).
The third line is included since A:bound is likely to be used as an operand in
subscript calculations and therefore needs the appropriate attributes. The final line
indicates the type and size of the components of the array. In choosing the
attributes to be included in this array declaration, every effort has been made to
ensure that each quantity which might appear as an operand in subsequent
operations has the required attributes. This eliminates the need for any special
casing - a multiply operation performed during a subscript calculation receives the
same treatment as any multiply operation.

In many cases the "*" notation is more powerful than the corresponding
expansion. For example, consider the declaration given above and the attribute
reference <A>.<I>:type (the type of the Ith component of A). The last line of the
declaration indicates that the type of any component is "boolean" and so
<A>.<I>:type can be resolved to "boolean" without further ado. If, on the other
hand, separate type definitions had been provided for each component - i.e.,
<A>.1:type=boolean, etc. - resolution of <A>.<I>:type could not proceed without
more knowledge of <I> (the value of the subscript). Even though bounds checking
may be desirable, it is better accomplished explicitly at run time rather than
implicitly during compile-time type checking. Another solution would be to endow
the metainterpreter with special knowledge concerning attributes of array
subscripts, but this leads to undesirable language dependencies in the
metainterpreter. All in all, the "*" notation comes much closer to the semantics
common to most aggregate data and leads to a simple mechanization of attribute
resolution.
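
The resolution rule just described can be mechanized in a few lines. The sketch below is an illustration, not the metainterpreter's actual algorithm: the table keyed by (cell, selector, attribute name) and the function name resolve are assumptions made for the example.

```python
def resolve(attrs, cell, selector, name):
    """Resolve the rvalue attribute reference <cell>.selector:name.

    attrs maps (cell, selector, attribute_name) to a literal value. An exact
    match on the selector wins; otherwise the wild card "*" is tried. None is
    returned when the reference cannot be resolved.
    """
    exact = attrs.get((cell, selector, name))
    if exact is not None:
        return exact
    return attrs.get((cell, "*", name))

# Declaration using the wild card, as in the array example above.
wild = {("A", "*", "type"): "boolean", ("A", "*", "size"): "1"}

# Alternative declaration listing components one at a time (only two shown).
expanded = {("A", "1", "type"): "boolean", ("A", "2", "type"): "boolean"}

known_subscript = resolve(wild, "A", "3", "type")
unknown_subscript = resolve(wild, "A", "<I>", "type")
stuck = resolve(expanded, "A", "<I>", "type")
```

With the wild-card declaration both references resolve to "boolean"; with the per-component declaration the reference through the unknown subscript <I> cannot be resolved, which is the weakness of the expanded form noted above.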

Operations which affect the rvalue of an aggregate cell (e.g., an array
assignment to <A>) are understood to change the rvalues of the components (e.g.,
<A>.1, <A>.2, ..., <A>.10). The converse is also true: a change in a component's
rvalue changes the rvalue of the aggregate. Both cases are based on the premise
that the rvalue of an aggregate is the "sum" of its components - i.e., that the
rvalue of an aggregate is not maintained separately from the rvalues of its
components. Thus <A> is equivalent to <A>.* (when speaking of rvalues - this
differs from the conclusion reached above for the meaning of attributes). The
effect of this reasoning (see discussion in §[...] on augmentation of kill sets)
coincides with common practice: a change in a single component of A should
invalidate any temporary copies of the whole array <A> but should not affect
temporary copies of other components (e.g., <A>.7); on the other hand, changes in
the whole array should invalidate temporary copies of any component.

As a final example of a structured cell, consider the following series of IL
statements (see §2.3.3 for a detailed description of the ALIAS pseudo-operation):

Label    Operator       Operands    Attributes

X        declaration                [...]
                                    <X>:type=long <X>:size=4
I        ALIAS          X.1         I:type=unsigned_integer I:size=2
                                    <I>:type=integer <I>:size=2
J        ALIAS          X.2         J:type=unsigned_integer J:size=2
                                    <J>:type=integer <J>:size=2



In this example, the rvalues of I and J overlay the rvalue of X (the designer has
the responsibility for making the storage allocated for I and J overlay the storage
for X in the final translation by adding appropriate transformations to the
catalogue). Note that although X is not explicitly declared to have any
components, aliasing I and J to X.1 and X.2 has caused them to become
components of X. Thus, using the reasoning of the preceding paragraph:

(1) changes to the rvalue of X invalidate the rvalues of I and J;

(2) changes to the rvalue of I invalidate the rvalue of X, but do not
    affect the rvalue of J; and

(3) changes to the rvalue of J invalidate the rvalue of X, but do not
    affect the rvalue of I.

The final two conditions show that I and J are understood to be disjoint. These
three conditions are just the semantics one associates with overlayed storage†.
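
The three conditions follow from a simple rule on structured names: a change to a cell affects the cell itself, every aggregate that contains it, and every component it contains, but not its siblings. The sketch below assumes cells are named by dotted paths and that aliases have already been replaced by the component they name; the function and variable names are illustrative only.

```python
def invalidated_by(changed, cells):
    """Return the cells whose temporary copies become stale when `changed` is assigned.

    Cell names are dotted paths such as "X" or "X.1". Two cells are related
    when they are identical or when one is a prefix (enclosing aggregate) of
    the other; unrelated components are left untouched.
    """
    def related(a, b):
        return a == b or a.startswith(b + ".") or b.startswith(a + ".")
    return {c for c in cells if related(c, changed)}

# The ALIAS statements of the example make I and J names for X.1 and X.2.
ALIASES = {"I": "X.1", "J": "X.2"}
cells = ["X", "X.1", "X.2"]

after_X = invalidated_by("X", cells)            # condition (1)
after_I = invalidated_by(ALIASES["I"], cells)   # condition (2)
after_J = invalidated_by(ALIASES["J"], cells)   # condition (3)
```

A change to X reaches both components, while a change to I reaches X but leaves J alone, matching the overlay semantics stated above.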

§2.3 The syntax of IL

An IL program is a sequence of statements made up of tokens classed as
literals, lvalues (the name of a cell), or rvalues (the application of the contents
operator to an lvalue). Depending on where a token appears in an IL statement, it
is further classified as a label, operator, operand, or attribute. Label tokens must
be lvalues; operator and attribute tokens are always literals; operands may
be any flavor. Beyond the semantics associated with these four classes of tokens,
IL provides no further interpretation of ordinary tokens. In this sense, IL is similar
to a BNF: neither provides any interpretation of the symbols of the language.
Special tokens are provided to indicate transfers of control and their corresponding
targets within an IL program. These tokens are used in data flow analysis and are
seldom referenced directly by the user. An IL statement has the following form:



† No provision has been made to show how to compute new values of I and J from
a new value of X (and vice versa). Such computations depend on
storage allocation and machine representations and so should be relegated to the
transformation catalogue. Such transformations can be generated at compile-time
from the ALIAS statement through the use of transformation macros (see §3.?).

Label    Operator    Operands      Attributes

label    operator    operand...    attribute...



where the components are described below.

label         This field names the cells whose rvalues might be changed by this
              statement. Two labels, → and •, have a special meaning to the
              metainterpreter (see §[...]).

operator      This field indicates the operation performed by the statement.

operand...    Zero or more operands used as arguments to the preceding operation.

attribute...  A set of zero or more name=value pairs further describing the
              context and semantics of the statement.
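
One way to hold a statement of this form in memory is sketched below. It is a hedged illustration rather than the thesis's representation: the class name Statement, the ASCII stand-ins "->" and "." for the two special labels, and the reading of → as "transfer of control" and • as "target of a transfer" are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# ASCII stand-ins (an assumption) for the two special labels.
TRANSFER, TARGET = "->", "."

@dataclass
class Statement:
    label: str                                        # an lvalue or a special label
    operator: str                                     # always a literal
    operands: list = field(default_factory=list)      # literals, lvalues, or rvalues
    attributes: dict = field(default_factory=dict)    # attribute name -> literal value

    def is_transfer(self):
        """True when the statement may transfer control."""
        return self.label == TRANSFER

    def kill_set(self):
        """Cells whose rvalues this statement might change."""
        if self.label in (TRANSFER, TARGET):
            return set()
        return {self.label}

compare = Statement("T1", "greater_than", ["<X>", "<Y>"])
branch = Statement(TRANSFER, "if_goto", ["<T1>", "L2", "L1"])
declare = Statement("X", "declaration", [], {"<X>:type": "integer", "<X>:size": "2"})
```

Both the kill set and the control effect are read directly off the label field, which is the sense in which the effect of a statement is determined by its syntactic form.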

Figure 2.1 shows the initial IL representation of the following program:

    integer X, Y, Z;

    if X>Y then { X=2; Y=3 } else { X=3; Y=2 };
    Z=X+Y;

There is no single IL representation for a given program - e.g., one could eliminate
the definition of C1 and C2 entirely from Figure 2.1 and use the literals "2" and
"3" directly. Choices as to the number of levels of indirection, etc. are not
dictated by IL and can be made on the basis of compatibility with the
transformation catalogue, appropriateness for the target machine, etc. Note that in
Figure 2.1 attributes have only been given for the declaration portion of the
program - the remainder will be filled in by the metainterpreter as it applies
transformations. The initial attributes are similar to those that might be provided by
the first phase of the compiler. Attributes are described in more detail in §2.2.1.

    Label   Operator       Operands        Attributes

    X       declaration                    X:type=integer X:size=2
                                           <X>:type=integer <X>:size=2
    Y       declaration                    Y:type=integer Y:size=2
                                           <Y>:type=integer <Y>:size=2
    Z       declaration                    Z:type=integer Z:size=2
                                           <Z>:type=integer <Z>:size=2
    C1      constant       "2"             <C1>:type=integer
    C2      constant       "3"             <C2>:type=integer
    T1      greater_than   <X> <Y>
    →       if_goto        <T1> L2 L1
    •       label          L1
    X       store          <C2>
    Y       store          <C1>
    →       goto           L3
    •       label          L2
    X       store          <C1>
    Y       store          <C2>
    •       label          L3
    T2      add            <X> <Y>
    Z       store          <T2>

    Figure 2.1: Initial IL representation

In the description which follows, it will be useful to characterize tokens as
either literals or references (either an lvalue or rvalue). By way of example,
consider the following two lines from Figure 2.2:

    Label   Operator   Operands              Attributes

    T100    equal      <X>:type "integer"
    T1      add        <X> <Y>






The italicized tokens are literals; the rest, references. In IL, literals are nothing
more than character strings - interpretation of these strings is provided by the
transformation catalogue and the metainterpreter. References "refer" to values
established by other statements - they provide a level of indirection. The principal
difference between literals and references is that the meaning of a literal can be
established at compile time whereas references often refer to values that are not
known until execution time. Literals are of central importance during optimization
since their fixed semantics provide opportunities for compile time evaluation of
operations. Some references (e.g., <X>:type) may, depending on the context in
which they appear, refer to literals; in these cases it is advantageous to remove
the unnecessary level of indirection at compile time. For those references that
cannot be resolved into literals at compile time (e.g., <X>), it will be necessary to
produce code that actually performs the indirection specified in the IL program
(e.g., by generating a fetch from the storage location used to hold the desired
value).

§2.3.1 The label field

The label field of an IL statement lists the cells which are affected by
execution of that statement. A statement may affect a cell in two ways:

    a cell is killed by a statement if execution of the statement might
    cause the rvalue of the cell to change. The list of killed cells is
    called the kill set.

    a cell is defined by a statement if execution of the statement always
    changes the rvalue of the cell. The list of defined cells is called the
    defined set of the statement. Note that the defined set is a subset of
    the kill set for any statement.

When a cell is killed, its rvalue can no longer be used for calculating common
subexpressions (assuming that the cell had not been killed previously). If a cell is
defined by a statement, it will always contain the value calculated by the
statement after the statement's execution. Therefore, if a statement executed
subsequently is identified as performing the same computation, it can be replaced
by a reference to the defined cell; moreover, if the defined value is a literal,
subsequent references to the rvalue of the defined cell can be resolved to that
literal. By convention, the lvalue of each affected cell is listed in the label field;
the implicit contents operator is omitted for the sake of brevity. The label field is
used by the metainterpreter in two important optimizations: redundant computation
elimination and use-definition chaining (compile-time evaluation of statements).

With one exception, the kill set provides all the information needed to
perform these optimizations. This suggests two formats for the label field: "K" and
"K,D" where K is the kill set of the statement and D the corresponding defined set
(D ⊆ K). When the abbreviated first format is used, D is calculated as follows:

    case 1. If K is empty (|K| = 0) then D = ∅.

    case 2. If K has a single element (|K| = 1) then D = K.

    case 3. If |K| > 1 then D = ∅.

    case 4. If K = {*} then D = ∅.
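
The four cases can be sketched as a small Python function. This is only an illustration: it assumes the cases read as "empty K gives an empty D, a single-element K gives D = K, a larger K gives an empty D, and K = {*} gives an empty D", and the name `defined_set` and the use of Python sets of strings are choices made for this example.

```python
ALL_CELLS = "*"  # label meaning "every cell might be affected"


def defined_set(kill_set):
    """Return the defined set D implied by an abbreviated label holding only K."""
    kill_set = set(kill_set)
    if kill_set == {ALL_CELLS}:
        return set()          # case 4: anything may change, nothing is certain
    if len(kill_set) == 1:
        return set(kill_set)  # case 2: the single listed cell is always changed
    return set()              # case 1 (empty K) and case 3 (several candidates)
```

The test for `{*}` has to come before the single-element test, because `{*}` itself has exactly one member.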

Considering only statements that affect at most one cell (all the statements in
Figure 2.1 fall into this category), there is a natural interpretation for each of the
above cases. Statements affecting no cells (e.g., [illegible]) are covered
by case 1. Statements whose operators have an applicative semantics (add,
multiply, etc.) fall under case 2; the single element of the kill set is the lvalue of
the cell where the result is stored. The specified cell is always changed by
executing the statement, so D = K. This is also the case for assignment
statements which always change the same cell (i.e., they do not compute its
lvalue) - in these statements the label is essentially another operand. Case 3
covers assignment statements that compute the lvalue of the cell in which the
result is to be placed, e.g., assignments through pointers or to array elements with
non-constant subscripts. Here, each cell in K has been killed (its previous rvalue
may have been changed, thus it can no longer be assumed that it is available);
however, no cell in K has been defined (no single cell is known to have been
changed), hence D = ∅. In the final case, a label of "*" indicates that all cells
might be affected by executing the statement. For essentially the same reasons
given in §2.2, no provision has been made for specializing "*" by specific cell
attributes (e.g., type): in almost every language there exist loopholes which make
attribute information unreliable†. This label is used when the statement has
unfathomable side effects, for example, when the label field contains too complex
an expression (e.g., deeply nested contents operators) - when an lvalue
subexpression has become unwieldy it is always legal to assume its value is "*"
and proceed from there. This overly conservative interpretation may result in
missed optimization opportunities but never in an incorrect translation.

† This attribute information will be used in the expansion of operations in the IL
program. Despite the suspect nature of attribute values, this is the semantics
provided by many languages and relied upon by programmers to circumvent certain
language restrictions. However, this information cannot be used as the basis for
optimizations, as this would lead to incorrectly transformed programs - only the
programmer is allowed to play havoc with his program!

Procedure calls have the potential of affecting many cells and so do not fall
into the categories discussed above. The sequence of statements which form the
body of the procedure may kill and define cells - taken in the aggregate it is
possible that K ≠ D ≠ ∅. In addition, procedures that return a value add yet another
element to D (the cell containing the returned value). The second label format,
"K,D", is used for procedure calls. While it is theoretically possible to compute the
appropriate label by examining the body of the procedure, this calculation quickly
becomes unwieldy. A reasonable alternative is to assign procedure calls the label
"*,R" where R is the lvalue of the cell in which the returned value (if any) is
stored. Thus the semantics of a procedure call is reduced to invalidating
previously calculated values for all cells except the one containing the return
value.

As was outlined in §2.2.2, it is occasionally necessary to augment the kill set
of a statement to account for the semantics of aggregate cells. Although the size
of the kill set may be increased, the defined set calculated above remains
unchanged - essentially no new cells are being added to the kill set, but only other
lvalues for the affected rvalue(s). The objective of augmenting the kill set is to
explicitly include the lvalue of every cell which is affected by the statement; this
reduces the amount of computation performed by the metainterpreter when using
the kill set.

The following algorithm constructs an augmented kill set K' from the original
kill set K. K' will include all lvalues ALIASed to lvalues in K as well as the lvalues
of aggregates which subsume lvalues in K. In constructing K', a distinction is made
between an aggregate and its components: if an aggregate name appears in K', it
refers to the aggregate treated as a single value (i.e., any temporary copies of the
entire aggregate should be invalidated); if temporary copies of an aggregate's
components should also be invalidated, the ".*" notation is used. For example, "A"
would invalidate any copies of the array A but leave its components unaffected;
"A.*" would invalidate any components (and subcomponents, etc.) of A. The
algorithm is

    1. Initially K' = K.

    2. For each structured lvalue α in K, add "α.*" to K'. An lvalue is
       structured if any [illegible] lvalue component [illegible]
       value was in the original [illegible]
       invalidated in the augmented kill set.

    3. For each lvalue α in K', add any aliases declared for α to K'.

    4. For each component lvalue α.β in K', add α to K'. The idea here is to
       add all the prefixes for each component lvalue, e.g., if A.1.2.3 were
       in K', then A.1.2, A.1 and A would be added.

    5. Repeat steps 3 and 4 until no new additions are made to K'.

The final result for K' is the augmented kill set for the statement. The following
series of examples should clarify the workings of the algorithm; for ease of
exhibition, duplicate lvalues (e.g., X.* and X.1) have been removed from the kill
sets. The examples assume the declaration given in the examples in §2.2.2.

    original              augmented
    kill set (K)          kill set (K')

    {A}                   [illegible]
    {A.3 A.4}             {A.3 A.4 A}
    {A.*}                 {A.* A}
    [illegible]           {I X.1 X}
    {X}                   [illegible]

Note that the augmented kill sets agree with the desiderata outlined in §2.2.2.
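
The augmentation steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the metainterpreter's implementation: lvalues are dotted strings, the caller supplies which lvalues count as structured (the definition is not fully legible here) and which aliases have been declared, and the names `prefixes`, `augment_kill_set`, `structured` and `aliases` are hypothetical.

```python
def prefixes(lvalue):
    """Yield every proper prefix of a dotted lvalue: A.1.2 -> A.1, A."""
    parts = lvalue.split(".")
    for n in range(len(parts) - 1, 0, -1):
        yield ".".join(parts[:n])


def augment_kill_set(kill_set, structured=(), aliases=None):
    """Build K' from K; `structured` and `aliases` are caller-supplied assumptions."""
    aliases = aliases or {}
    augmented = set(kill_set)                  # step 1: K' starts as K
    for lv in kill_set:                        # step 2: add "lv.*" for structured lvalues
        if lv in structured:
            augmented.add(lv + ".*")
    changed = True
    while changed:                             # step 5: repeat steps 3 and 4 to a fixed point
        changed = False
        for lv in list(augmented):
            new = set(aliases.get(lv, ()))     # step 3: declared aliases
            new.update(prefixes(lv))           # step 4: enclosing aggregates
            if not new <= augmented:
                augmented |= new
                changed = True
    return augmented
```

For instance, `augment_kill_set({"A.3", "A.4"})` yields the set containing A.3, A.4 and A, which agrees with the second row of the example table.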



§2.3.2 The operator and operand fields

No particular semantics is attached to the operator field of a statement.
The meaning of an operator is established by transformations which expand it into
other IL or target machine operations. A useful analogy for an IL operator is a
macro - the body of the macro defines the effect of an operator in terms of other,
usually simpler, operations. If the effect of the macro can be accomplished directly
by the target machine no further refinement of the operation is necessary; the
translation of the statement is complete. Otherwise, the body of the macro (in this
case a sequence of IL operations) should be substituted for the operation, making
the appropriate substitutions of actual operands for formal parameters of the
macro. If each expansion is subject to later optimization, it is possible to use
general definitions for each macro operation, i.e., definitions such as one would find
in an interpreter. Special cases that hinge on particular values of the operands
would be explicitly tested for in the substituted sequence; later optimization would
eliminate those operations which could be performed at compile time. For example
(see Figure 2.2), the expansion of the addition operator might test the type of its
operands and then perform an integer or floating point addition as appropriate. If
the type of the operands could be established at compile time, this test would be
subsumed during optimization. Although it is not necessary, use of general
definitions greatly simplifies the top level of a specification since there will be only
one transformation for an operation rather than one for each special case.
    Label   Operator      Operands              Attributes

    X       declaration                         <X>:type=integer
    Y       declaration                         <Y>:type=real
    T1      plus          <X> <Y>

    Figure 2.2a: Original IL program

    Label   Operator      Operands              Attributes

    X       declaration                         <X>:type=integer
    Y       declaration                         <Y>:type=real
    T100    equal         <X>:type "integer"
    →       if_goto       <T100> L1 L4
    •       label         L1
    T101    equal         <Y>:type "integer"
    →       if_goto       <T101> L2 L3
    •       label         L2
    T1      add           <X> <Y>
    →       goto          L7
    •       label         L3
    T102    float         <X>
    T1      addf          <T102> <Y>
    →       goto          L7
    •       label         L4
    T103    equal         <Y>:type "real"
    →       if_goto       <T103> L5 L6
    •       label         L5
    T1      addf          <X> <Y>
    →       goto          L7
    •       label         L6
    T104    float         <Y>
    T1      addf          <X> <T104>
    •       label         L7

    Figure 2.2b: IL program with expanded definition of plus

    Label   Operator      Operands              Attributes

    X       declaration                         <X>:type=integer
    Y       declaration                         <Y>:type=real
    T102    float         <X>
    T1      addf          <T102> <Y>

    Figure 2.2c: "Optimized" IL program
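
The interplay between a general expansion and later optimization can be imitated in a few lines of Python. The sketch below is a deliberate simplification: it folds the type tests while expanding, whereas the text describes expansion and optimization as separate steps. The function name `expand_plus`, the temporaries `T_x` and `T_y`, and the tuple encoding of statements are all assumptions made for this illustration.

```python
def expand_plus(result, x, y, types):
    """Return IL-like (label, operator, operands) tuples for `result = x plus y`.

    `types` maps operand names to "integer" or "real" when the type is
    known at compile time. Tests on known types are decided here, which
    mimics what later optimization would do to the general expansion.
    """
    tx, ty = types.get(x), types.get(y)
    if tx is None or ty is None:
        # Types unknown: keep the general operation for run-time dispatch.
        return [(result, "plus", [x, y])]
    if tx == "integer" and ty == "integer":
        return [(result, "add", [x, y])]
    code = []
    if tx == "integer":
        code.append(("T_x", "float", [x]))   # convert the integer operand first
        x = "<T_x>"
    if ty == "integer":
        code.append(("T_y", "float", [y]))
        y = "<T_y>"
    code.append((result, "addf", [x, y]))
    return code
```

With an integer X and a real Y the sketch produces a `float` followed by an `addf`, the same shape as the optimized program of Figure 2.2c.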
Operands serve as arguments to the preceding operation and may be any of
the following:

    • a literal. Literals are enclosed in quotes when they appear in the
      operand field so that they may be distinguished from lvalues.

    • an attribute reference [illegible]. All attribute operands should be
      able to be resolved at compile time (i.e., there should be an
      appropriate definition generated at some point in the expansion of
      the IL program). If no such definition exists then the reference is
      [illegible].

    • a reference expression: a simple lvalue if the lvalue of the cell
      is needed; otherwise, an rvalue expression (which may be nested)
      [illegible].

There is no a priori restriction on the complexity of a reference expression, but
more than one level of indirection (contents operator) will likely have to be
calculated in a separate statement. By convention, at most a single level of
indirection is used in an operand.



§2.3.3 The END and ALIAS pseudo-operations

Pseudo-operations provide a mechanism for informing the metainterpreter
about information difficult (or impossible) to derive from the IL program. IL
statements with pseudo-operators are "visible" to the transformations, which may
transform them into ordinary IL statements, etc., but they become "invisible" in the
final translation (i.e., they are not output in the resulting target machine program).
The names chosen for pseudo-operations are reserved and should not be used for
other purposes by the designer; in this thesis, pseudo-operators will be displayed in
upper case and all other operators displayed in lower case.

The statement in which the END pseudo-operation appears marks the logical
end of an IL statement sequence - flow analysis for that sequence will not
proceed past this statement. Statements following this statement up to the next
target statement (see §2.4) are considered inaccessible and will be removed by
the metainterpreter. The END pseudo-operation is intended for use at the end of
the IL program and for marking the end of procedure bodies within the IL program;
presumably some transformation will translate it into an exit or return as appropriate.
This operation makes no use of the label, operand, or attribute fields and so may be
used as the operator of a target statement.

The ALIAS pseudo-operation provides the capability of defining equivalence
classes of lvalues - any member of an equivalence class refers to the same rvalue
(although each member may have different attributes associated with it). This
operation is used to indicate sharing of rvalues (overlaying of storage) as declared
by the source language program (e.g., with the FORTRAN EQUIVALENCE statement)
or as determined in some transformation (e.g., when used to indicate that two cells
hold the same value; this typically occurs during optimization when a sequence of
statements boils down to a move from one cell to a temporary - the ALIAS
operation would indicate that the temporary is aliased with the original cell). In the
latter case, the ALIAS operation provides a renaming capability to the
transformation designer. The form of the ALIAS statement is

    Label        Operator      Operands      Attributes

    lvalue1      ALIAS         lvalue2       attributes...

which causes the metainterpreter to place lvalue1 in the same equivalence
class as lvalue2. Note that, by definition, ALIAS is a transitive operation. Typically
lvalue1 is the new lvalue to be defined and attributes... are its initial attributes.
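
One conventional way to maintain such equivalence classes is a union-find structure; the Python sketch below shows the idea. Nothing in the text says the metainterpreter is built this way, and the names `AliasClasses`, `alias` and `same_cell` are hypothetical.

```python
class AliasClasses:
    """Equivalence classes of lvalues, maintained with union-find."""

    def __init__(self):
        self.parent = {}

    def find(self, lvalue):
        """Return the representative of lvalue's class, compressing the path."""
        self.parent.setdefault(lvalue, lvalue)
        root = lvalue
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[lvalue] != root:
            self.parent[lvalue], lvalue = root, self.parent[lvalue]
        return root

    def alias(self, lvalue1, lvalue2):
        """Process `lvalue1 ALIAS lvalue2`: merge the two classes."""
        self.parent[self.find(lvalue1)] = self.find(lvalue2)

    def same_cell(self, a, b):
        """True when both lvalues refer to the same rvalue."""
        return self.find(a) == self.find(b)
```

Because classes are merged rather than pairs recorded, transitivity comes for free: aliasing T5 to X and then X to I places T5 and I in the same class.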

§2.4 Flow of control in an IL program

In the previous sections, the syntax and semantics of a single IL statement
were described; this section describes the semantics of a sequence of IL
statements. IL statements are executed sequentially, modulo explicit transfers of
control. This classic control structure was chosen because of its compatibility with
the control structure provided by most target machines - the operations primitive
to IL are similar to those provided at the machine level. Sequential execution is
also compatible with a wide variety of languages, especially those that have
relatively severe ordering constraints (e.g., Algol, which specifies strict left-to-right
evaluation of expressions). This control structure is more constraining than
the one provided by the dags on which IL was modeled; the only constraint imposed
by dags is that the sons (operands) of an interior node (operation) must be
evaluated before the node can be evaluated. Many languages (e.g., BLISS) take
advantage of this flexibility in expression evaluation by only imposing evaluation
order constraints on certain operators (such as [illegible]). Such flexibility is not
inherent in an IL program and must be provided by the transformation catalogue and
the metainterpreter; i.e., transformations can change the order of statements in an IL
program†.

As was mentioned at the beginning of this chapter, the syntactic
conventions discussed below are not particularly appropriate for languages whose
control structure differs substantially from that above. SNOBOL, for example,
requires a "transfer of control" with every statement - the difficulty in
accommodating this construct in IL reflects the difficulties in producing a SNOBOL
compiler for conventional machines; perhaps when the latter problem has been
solved, the solution can be incorporated in IL.



† In general these transformations only change the evaluation order to achieve
some goal, for example, a reduction in the number of registers required to evaluate
the operator. In this way the conditions under which evaluation order can be
modified and what metrics are used to judge the result are made explicit in the
transformation catalogue. This information would be useful during the analysis
phase of a metacompiler attempting to construct a code generator from the
specification.

IL statements which cause a transfer of control (transfer statements) are
readily identified: they have a "→" in their label field. Note that this use of the
label field prevents the statement from also computing (and saving) a value; it can
only effect a transfer of control. Procedure calls are handled differently: since
control returns to the statement following the procedure call, they are similar to
ordinary statements except for the possible side effects of the procedure body. In
§2.3.1, a convention for the label field for procedure calls was established (listing
the side effects of the procedure); thus, no transfer is explicitly indicated.
Procedures are treated as "complex" operations insofar as this section is
concerned. Note that a transfer statement always transfers control; if execution
can conditionally continue with the next statement, it must be provided for
explicitly by adding an additional label statement.

IL statements which are targets for a transfer of control (target
statements) are identified by placing a "•" in the label field. As for transfer
statements, target statements cannot compute (and save) a value since their label
field has been preempted. The following convention is used by the metainterpreter
for determining which target statements are possible targets for a given transfer
statement:

    a target statement is a target for a given transfer statement iff the
    same lvalue appears as some operand of both the target statement
    and the transfer statement.

This convention allows additional arguments to transfer and target statements
which can be used by the operator of these statements. The following example
(extracted from Figure 2.2b) illustrates the convention more clearly:

    Label      Operator      Operands

    →          if_goto       <T100> L1 L4
    •          label         L1
    •          label         L4

The first line is a transfer statement ("→" in the label field) which can transfer
control [illegible]. The second statement is recognized as a possible target since
L1 is an operand of both the first and second statements. [illegible] it is not
possible [illegible] depends on the semantics of the if_goto operation (and
presumably the value of T100), information that only exists in the transformation
catalogue.

This information is used by the metainterpreter to construct a "maximal" flow
graph for the IL program. The flow graph is maximal in the sense that all possible
transfers are considered for each transfer statement, even those which may be
ruled out by the semantics of the operator of the transfer statement. This graph
serves as the basis for the flow analysis performed by the metainterpreter and is
updated whenever a transformation changes or eliminates a transfer statement.
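
The shared-operand convention makes the maximal flow graph easy to compute. The following Python sketch is illustrative only: statements are encoded as (label, operator, operands) tuples, `"->"` and `"@"` are ASCII stand-ins for the transfer and target labels, operands are compared as plain strings, and the function name `maximal_flow_graph` is hypothetical.

```python
def maximal_flow_graph(program):
    """Map each statement index to the indices control may reach next.

    `program` is a list of (label, operator, operands) tuples; "->" marks
    a transfer statement and "@" a target statement (ASCII stand-ins).
    """
    targets = [(i, set(ops)) for i, (lab, _, ops) in enumerate(program) if lab == "@"]
    graph = {}
    for i, (label, operator, operands) in enumerate(program):
        if label == "->":
            # Every target statement sharing an operand is a possible successor.
            graph[i] = [t for t, t_ops in targets if t_ops & set(operands)]
        elif operator == "END" or i + 1 == len(program):
            graph[i] = []                 # flow analysis stops here
        else:
            graph[i] = [i + 1]            # ordinary sequential execution
    return graph


example = [
    ("T100", "equal", ["<X>:type", '"integer"']),
    ("->", "if_goto", ["<T100>", "L1", "L4"]),
    ("@", "label", ["L1"]),
    ("T1", "add", ["<X>", "<Y>"]),
    ("->", "goto", ["L7"]),
    ("@", "label", ["L4"]),
    ("T1", "addf", ["<X>", "<Y>"]),
    ("@", "label", ["L7"]),
]
flow = maximal_flow_graph(example)
```

The `if_goto` at index 1 receives both label statements as successors, since the graph ignores the operator's semantics and the value of T100.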

§2.5 Compile-time calculation of rvalues

One of the goals for the syntax of IL is to allow the compile-time calculation
of rvalues. This section briefly touches on the resolution of rvalues (and lvalues)
using the notation developed in earlier sections. In this section, set notation is
used to indicate possible values for a reference expression, e.g., if the rvalue of I
is known to be either 3 or 4 then we write <I> = {3 4}. If the value of a
reference expression is unknown (i.e., it could be any possible value) then we
write {*}.

Occasionally, it is possible to further resolve a particular reference
expression. If <I> = {3 4} then

    <A>.<I> = <A>.{3 4} = {<A>.3 <A>.4}.

If, on the other hand, the value of I is unknown (<I> = {*}), then

    <A>.<I> = <A>.{*} = <A>.*.

IL recognizes the alternative forms in each example as equivalent: in effect, such
resolution is performed automatically. Even in the absence of knowledge about the
rvalue of I, a reasonable interpretation of lvalues incorporating <I> is possible,
erring only in that it is likely to be an overly conservative interpretation. In the
second example above, the distinction between "*" as an abbreviation for all
possible component names and {*} as the representation for all possible values has
been deliberately blurred. The intent behind assigning numeric selectors for the
components of the array A is to allow this sort of felicitous confusion.
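
The two resolutions can be mirrored in a few lines of Python. The sketch assumes value sets are Python sets, that an unknown value is the one-element set containing "*", and that resolved references are rendered as strings; the names `UNKNOWN` and `resolve_component` are hypothetical.

```python
UNKNOWN = frozenset({"*"})  # the value set {*}: any value is possible


def resolve_component(aggregate, index_values):
    """Resolve <aggregate>.<index> given the index's set of possible values."""
    if "*" in index_values:
        return {f"<{aggregate}>.*"}  # unknown index: every component
    return {f"<{aggregate}>.{v}" for v in index_values}
```

Treating "*" both as "unknown value" and as "all components" is exactly the deliberate blurring described above.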

As a rule of thumb, the utility of the compile-time computation of a cell's
rvalue is inversely proportional to the size of the value set. There are several
contributing factors: as the size of the value set increases, it becomes
increasingly unlikely that any significant optimizations will be possible for rvalue
operations on that cell. In addition, uncertainty in one cell's rvalue tends to
propagate to other cells whenever the first cell is used as an operand (the size of
the value set of an operation is proportional to the product of the sizes of the value
sets of its operands). Such "dilution" of compile-time information is not unexpected -
it would be unreasonable to expect to perform all computations at compile time! However,
the prognosis at this point is not encouraging: it would appear that large amounts
of compile-time information could be collected with little prospect of a
corresponding gain in the optimality of the resulting translation.
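
The growth of value sets is easy to demonstrate. The sketch below computes the value set of an addition from the value sets of its operands and gives up once the result grows too large; the cut-off `limit=8` is an arbitrary assumption for this example, as are the names `UNKNOWN` and `add_value_sets`.

```python
from itertools import product

UNKNOWN = {"*"}  # stands for the value set {*}


def add_value_sets(xs, ys, limit=8):
    """Value set of x + y; falls back to {*} when an input is unknown or
    the result grows past `limit` (an arbitrary cut-off for this sketch)."""
    if "*" in xs or "*" in ys:
        return set(UNKNOWN)
    result = {x + y for x, y in product(xs, ys)}
    return result if len(result) <= limit else set(UNKNOWN)
```

Two operands with two possible values each already give up to four possible sums, and a single unknown operand makes the whole result unknown.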
The preceding paragraph prompts two observations: compile-time calculation
of rvalues is subject to the law of diminishing returns and therefore rvalues are
not suitable for cell attributes that do not change with each operation on the cell.
The first observation serves as the motivation for the introduction of {*} for
rvalue sets which have grown too large [illegible]. The second suggests that
attributes are a useful addition to the semantics of a cell since they provide a
§3.1 The transformation catalogue

A major design goal for the IL/ML system was to keep knowledge about the
source language and target machine separate from general knowledge about code
generation. This was accomplished by providing for a separate description of
machine- and language-dependent semantics - the embodiment of this description is
the transformation catalogue. Each piece of language- or machine-specific
information is expressed as a syntactic transformation of an IL program fragment;
after the transformation has been applied, the updated program will have been
modified to incorporate this new information in terms the metainterpreter
understands: as attributes or a new sequence of IL statements. The
metainterpreter provides the remainder of the framework needed to finish the task
of code generation: whenever it exhausts its analysis of the current program it
returns to the transformation catalogue to gather additional information (in the form
of a "new" IL program to analyze). This cycle of analysis and transformation
repeats until the translation is complete.

This chapter discusses the transformation catalogue and the language which
serves as its basis: a metalanguage (ML) for describing IL program fragments.
Using ML, the designer can write templates which describe the class of IL
statements in which he is interested. This class can be quite large (e.g., "all IL
statements which have commutative operators") or quite small (e.g., "only
statements which apply the sine operator to the argument 3.14160") depending on
the application the designer has in mind. Members of the class of IL fragments
described by a template are said to match the template. §3.2 presents a detailed
description of the syntax of ML.

Two templates are incorporated in each transformation: one as a pattern, the
other as a replacement. The pattern specifies the context of the transformation as
a set of program fragments on which the transformation can operate†. IL
statement(s) which match the pattern become candidates for the modifications
specified by the replacement. The replacement, perhaps using statements or
components matched by the pattern, tells how to construct a new IL program
fragment to be substituted for the matched fragment.
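
A toy version of the pattern/replacement idea can be written in Python. Everything here is an assumption for the sake of illustration: statements are (label, operator, operands) tuples, tokens beginning with "?" act as named wild cards, and the sample transformation (rewriting an `add` of the literal "0" into a `store`) is invented for this example and is not taken from the catalogue.

```python
def match(pattern, statement, bindings=None):
    """Match one statement against a pattern; return bindings or None.

    Both are (label, operator, operands) tuples. Pattern tokens that start
    with "?" are named wild cards (a notation assumed for this sketch).
    """
    bindings = dict(bindings or {})
    p_tokens = [pattern[0], pattern[1], *pattern[2]]
    s_tokens = [statement[0], statement[1], *statement[2]]
    if len(p_tokens) != len(s_tokens):
        return None
    for p, s in zip(p_tokens, s_tokens):
        if p.startswith("?"):
            if bindings.setdefault(p, s) != s:  # a repeated wild card must agree
                return None
        elif p != s:
            return None
    return bindings


def instantiate(replacement, bindings):
    """Build the replacement statement by substituting matched tokens."""
    label, operator, operands = replacement
    sub = lambda tok: bindings.get(tok, tok)
    return (sub(label), sub(operator), [sub(o) for o in operands])


def apply_transformation(pattern, replacement, statement):
    """Rewrite the statement if the pattern matches; otherwise return it unchanged."""
    bindings = match(pattern, statement)
    return statement if bindings is None else instantiate(replacement, bindings)


pattern = ("?t", "add", ["?a", '"0"'])
replacement = ("?t", "store", ["?a"])
```

Applying this pair to `("T2", "add", ["<X>", '"0"'])` gives `("T2", "store", ["<X>"])`, while a statement that does not match is returned untouched.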

The use of transformations is a well-established technique for embodying
knowledge for later use in a mechanized fashion (see §1.?.1). If all the contextual
information in the data base (in this case, the IL program) is available in syntactic
form, patterns provide a concise description of where the piece of information
captured by the transformation is applicable. Using the transformation catalogue is
reduced to finding a transformation which matches the given IL statement (or any IL
statement, if the metainterpreter has no specific goal in mind); alternatively, the
replacement (which is also a pattern) can be examined to determine if it
accomplishes the desired effect. The ability to use transformations from either end
enhances their utility as the basis for knowledge representation.

§3.3 describes how transformations are constructed and how they are used
by the metainterpreter. The final section of this chapter presents a series of
annotated example transformations.



† This context can be further modified by a set of conditions specifying
constraints which are not expressible in terms of the syntax of the IL program
(see §3.3.1).

§3.2 ML: a language for describing IL program fragments

ML is similar to other metalanguages - its syntax subsumes that of IL (i.e.,
an IL statement is a legal ML statement) and, in addition, it allows certain
metasymbols to replace IL components or statements. The metasymbols come in
two flavors: wild cards that act as "don't cares" in the matching process, and calls
to built-in functions that allow access to some of the metainterpreter's knowledge
of IL program semantics. Use of these metasymbols permits the designer to write
generalized IL program fragments; these fragments are more general than an IL
program fragment because the designer has constrained only those statement
components in which he is interested (using wild cards to specify the remaining
components).

However, the d i Marm can pah/ paa a ra Jl aa aisag cartairv dhi i sws k >na as hts 
only sccess to the meaning of an IL statement m »ts syntaotioi form and whatever 
built-in functions are available (see $3.2.2). Sines the separate Aside for kill sets 
and attributes In an IL statement seem to be as far as one can go towards making 
the syntactic form of an IL statement reflect the etatemanfs semantics without 
limiting the generality of IL, the limiting factors are the capabilities of the built-in 
functions. The designer can determine whether two literals are the aame but may 
not be able to And out, for example, whether the square root of a literal la an 
integer. These restrictions on the abMttSee of butWn functions are the moat severe 
limitation of ML: building In language- and machme-epeelflc predicates Into ML is 
ruled out as this effects the generaltty of the system i lately, It would 

be Impossible to Include ell the generally useful functions. Last we be accused of 
making a mountain out of a molehill, It should be- pointed out that the reault of these 
limitation* is missed optimization opportunities. Ifrs»iiamhly . aM- the computations 
specified in the IL program could be done at execution time; the computational
facilities provided by ML are intended to [...] the tailoring of the
transformations and not to [...] of the transformations. ML [...] built-in
functions for the manipulation of literals and for interpreting literals as numeric
quantities - other functions must be constructed from [...] the appropriate
transformations in the catalogue. These [...] additions to the catalogue are
sufficient for most purposes - for example, the catalogue may contain [...]
functions to certain arguments (π, π/2, etc.) but would translate all [...].

§3.2.1 describes wild cards; §3.2.2 enumerates some example built-in
functions. Examples of ML statements can be found in the last section of the chapter.



§3.2.1 Wild cards

Wild card metasymbols are used as components of an ML statement
wherever a specific ML component would be too restrictive - the wild card will
match any IL component(s). The discussion below describes the meaning of ML
statements when used in a pattern; to a large degree the semantics of a
replacement are the same (differences are described in §3.3.2). There are four forms
of wild card:

    ?name     matches a single IL component
    ?*name    matches a sequence of IL components
    $name     matches a single IL statement
    $*name    matches a sequence of IL statements

name is an optional identifier which serves to distinguish between multiple wild
cards used in a single pattern or replacement. These names are also used in the
replacement to refer to components or statements matched in the pattern. If a
given wild card appears more than once in a pattern or replacement (i.e., two or
more wild cards with the same form and name) they are understood to represent
the same IL component. If this duplication occurs within a pattern, then all the
copies must match IL components with the same representation.

The ? and $ wild cards match a single, non-null component or statement
respectively, i.e., for each ? ($) there must be a corresponding IL component
(statement) in the IL program fragment being matched. Note that when
describing an IL statement, all of its components (with the exception of attributes,
see §3.3.2) must be accounted for in the ML statement - either explicitly or as
wild cards - or the match will fail. Thus, if only the label field is to be constrained
in the pattern, wild card components must be used for the contents of the operator
(use ? wild card) and operand (use ?* wild card) fields.

The ?* wild card matches any sequence of zero or more components
within a single field - what components are matched usually depends on the
components on either side of the ?* wild card in the ML statement. If these
adjacent components constrain the match for the ?* wild card to a single
sequence, the ?* wild card is said to be unambiguous. In general, if more than one
?* wild card is used in a single field, they may be ambiguous; this is always the
case if two ?* wild cards are adjacent or separated by any number of ? wild
cards. Even if specific IL components are interposed, duplication of this component
in the IL field can cause the ?* wild cards to be ambiguous. For example, consider
the sequence of components "A B C C D". There are two ways in which
components can be assigned to the ML expression "?x ?*y C ?*z":

    ?x="A"  ?*y="B"  ?*z="C D"    or    ?x="A"  ?*y="B C"  ?*z="D".
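The two assignments above can be reproduced mechanically. The following is a minimal sketch, assuming a field is represented as a Python list of component strings and that every wild card is named; the function `match_field` and this representation are illustrative assumptions, not part of ML itself.

```python
def match_field(pattern, components, bindings=None):
    """Yield every assignment of wild cards in `pattern` to `components`.

    Assumes all wild cards are named: "?name" matches exactly one
    component, "?*name" matches zero or more adjacent components, and
    any other token must match an identical component.
    """
    bindings = dict(bindings or {})
    if not pattern:
        if not components:
            yield bindings
        return
    head, rest = pattern[0], pattern[1:]
    if head.startswith("?*"):
        # Sequence wild card: try every prefix length, including zero.
        for n in range(len(components) + 1):
            seq = tuple(components[:n])
            if bindings.get(head, seq) != seq:
                continue  # a duplicated wild card must match the same sequence
            yield from match_field(rest, components[n:], {**bindings, head: seq})
    elif head.startswith("?"):
        if components and bindings.get(head, components[0]) == components[0]:
            yield from match_field(rest, components[1:],
                                   {**bindings, head: components[0]})
    elif components and components[0] == head:
        yield from match_field(rest, components[1:], bindings)


matches = list(match_field(["?x", "?*y", "C", "?*z"], ["A", "B", "C", "C", "D"]))
for m in matches:
    print(m)
```

A pattern is ambiguous in the sense used here exactly when the generator yields more than one binding for the same field.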

Ambiguous wild cards are useful for matching a specific IL component anywhere in a
field; e.g., the following ML statement matches any add statement which has at [...]:

    [ML statement illegible in source; its operand field contains the wild cards
    ?*ops1 and ?*ops2, and its attribute field the wild card ?*attributes]

If "add" is a binary operator, one of ?*ops1 and ?*ops2 will be assigned no
components during matching. The ?*attributes wild card shows the more
traditional use of unambiguous wild cards to match a whole field for later replication
in the replacement.

The $* wild card matches a sequence of zero or more statements. Unlike
?* however, the sequence is not determined by strict juxtaposition in the IL
program but by flow of control: statements are considered adjacent in the process
of matching if one might follow the other in execution. Branches and joins in the
flow of control often result in more than one possible sequence of statements that
could match a $* wild card. For example, consider the IL program given in Figure
2.1 and the following sequence of ML statements:

    [sequence of ML statements illegible in source; it includes the wild card $*A]

Figure 3.1 shows the two possible sequences of IL statements that could be
matched by $*A. In such cases, both sequences are saved as possible values for
$*A. The most common use of $* wild cards (and the sets of statement sequences
that they match) is to establish the context of a transformation - there exist
built-in functions that test these sequences for simple properties (e.g., presence of
a given lvalue in the label field of at least one statement in one of the sequences).

    Label   Operator       Operands       Attributes
    C1      constant       "2"            <C1>:type=integer
    C2      constant       "3"            <C2>:type=integer
    T1      greater_than   <X> <Y>
    →       if_goto        <T1> L2 L1
    •       label          L1
    X       store          <C2>
    Y       store          <C1>
    →       goto           L3
    •       label          L3
    T2      add            <X> <Y>

    Label   Operator       Operands       Attributes
    C1      constant       "2"            <C1>:type=integer
    C2      constant       "3"            <C2>:type=integer
    T1      greater_than   <X> <Y>
    →       if_goto        <T1> L2 L1
    •       label          L2
    X       store          <C1>
    Y       store          <C2>
    •       label          L3
    T2      add            <X> <Y>

Figure 3.1: Matches for $*A from Figure 2.1
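To illustrate how flow of control, rather than textual order, determines the candidate sequences, here is a small sketch. It assumes the program is available as a successor map between statements; the statement identifiers (`X1`, `Y2`, and so on) and the function `paths` are hypothetical names chosen for this example, and the graph is inferred from the two sequences of Figure 3.1.

```python
def paths(successors, start, end):
    """Yield every acyclic path from `start` to `end` along flow of control."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            yield path
            continue
        for nxt in successors.get(node, []):
            if nxt not in path:          # ignore loops in this sketch
                stack.append((nxt, path + [nxt]))


# Successor map inferred from Figure 3.1 (identifiers are illustrative).
successors = {
    "C1": ["C2"], "C2": ["T1"], "T1": ["if_goto"],
    "if_goto": ["L2", "L1"],                 # two-way branch
    "L1": ["X1"], "X1": ["Y1"], "Y1": ["goto"], "goto": ["L3"],
    "L2": ["X2"], "X2": ["Y2"], "Y2": ["L3"],
    "L3": ["T2"],
}

candidates = sorted(paths(successors, "C1", "T2"), key=len)
for seq in candidates:
    print(" ".join(seq))
```

Each path corresponds to one candidate value for a `$*` wild card; a matcher following the text would keep all of them rather than choose one.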

§3.2.2 Built-in functions

Built-in functions are used in ML statements to perform operations that
require more power than simply rearranging an IL statement. A call on a built-in
function has the following form:

    function[argument1,...,argumentn]

The use of square brackets distinguishes built-in function calls from ordinary IL
components (which are restricted to the use of parentheses). All functions return
a result (no side effects are possible); this result can be used as the argument to
another built-in function or, if the call was part of a replacement, become part of an
IL program. The arguments to a function may be written as either IL or ML
components but they must be able to be resolved by the metainterpreter to a
particular IL component (or IL statement sequence for certain functions). In the
process of applying the function to its arguments, the metainterpreter may abort,
causing the application of the transformation to fail regardless of the location of the
function call (pattern, replacement, or conditions). The main reason for aborting a
function is an inappropriate argument, e.g., the argument has the wrong type,
cannot be resolved to a literal, etc. For instance, the add function aborts if both
operands are not literals that can be interpreted as numeric quantities.

By way of example, several functions are described below; this list is not
meant to be complete - only a sampling of each category of function has been
described. It is expected that an implementation would extend the list; the only
criterion for including a function is that it not cater to a specific language or
machine. The following argument types are used in describing functions:

    component   Any IL component is an acceptable argument.

    literal     The argument must be an IL literal (i.e., an operator,
                attribute reference, or operand enclosed in quotes).

    number      The argument must be an IL literal which can be
                interpreted as a number (i.e., it contains only
                digits and [...]).

    boolean     The argument must be one of the IL literals
                "true" or "false".

    sequence    The argument must be the result of a $* wild
                card match (i.e., a set of IL statement sequences).

If the supplied argument does not have the correct type, the metainterpreter will
abort the application of the function and hence the application of the
transformation in which it appears.

and[boolean,boolean]
or[boolean,boolean]
not[boolean]

    the standard boolean functions evaluating to the literals "true" or
    "false" as appropriate. These are used in conjunction
    with other functions to form more complicated expressions.

equal[literal,literal]

    compares two literals to see if they have the same representation;
    evaluates to "true" if they do, "false" otherwise. Note that equal
    cannot be used to compare two arbitrary IL components - this can
    usually be accomplished directly in the pattern by using the same
    wild card name in both component locations.

constant[component]

    evaluates to "true" if the argument is a literal, "false" otherwise.

lvalue[component]

    evaluates to "true" if the argument represents a valid lvalue.

label[label,sequence]

    evaluates to "true" if any member of the augmented kill set
    represented by label appears in the label field of a statement
    contained in the set of IL statement sequences sequence. This
    function determines whether a cell may have been modified in an IL
    statement sequence. The label function is representative of
    functions that search IL statement sequences for simple properties;
    other functions that test for properties in every sequence and search
    other statement fields should be included.

add[number,number]
subtract[number,number]
multiply[number,number]
divide[number,number]

    the standard arithmetic functions returning the appropriate numeric
    literal. In order to avoid representation problems, a precision limit
    may be set by the implementation.

power_of_two[number]

    evaluates to "true" if the argument is a numeric literal which is a
    power of two, "false" otherwise. This function is useful for
    determining when to change multiplications and divisions into shifts.
    This example represents the tip of the iceberg when it comes to
    useful arithmetic functions - a reasonable subset might be to include
    only operations on binary representations (binary log, logical and
    arithmetic shifts, etc.).
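The abort behaviour of these functions is easy to misread, so a small sketch may help. It assumes IL literals are represented as Python strings wrapped in double quotes and that numbers are integers; the `Abort` exception and the helper names are assumptions made for this illustration, not a definitive implementation.

```python
class Abort(Exception):
    """Raised when a built-in function is applied to an unsuitable argument."""


def is_literal(component):
    return (isinstance(component, str) and len(component) >= 2
            and component[0] == '"' and component[-1] == '"')


def as_number(component):
    # Assumption: only integer literals are treated as numbers here.
    if not is_literal(component):
        raise Abort(f"{component!r} is not a literal")
    try:
        return int(component[1:-1])
    except ValueError:
        raise Abort(f"{component!r} is not numeric") from None


def literal(value):
    return f'"{value}"'


def boolean(flag):
    return literal("true" if flag else "false")


def add(a, b):
    return literal(as_number(a) + as_number(b))


def equal(a, b):
    if not (is_literal(a) and is_literal(b)):
        raise Abort("equal compares literals only")
    return boolean(a == b)


def constant(component):
    return boolean(is_literal(component))


def power_of_two(a):
    n = as_number(a)
    return boolean(n > 0 and n & (n - 1) == 0)


print(add('"2"', '"3"'), power_of_two('"8"'), constant("<X>"))
```

Note that `constant` never aborts because it accepts any component, whereas `add` and `power_of_two` abort on anything that is not a numeric literal; the caller is expected to treat `Abort` as failure of the whole transformation.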

Choices of the domain (arguments for which the function will not abort) for the
predicates described above have been made arbitrarily. All that really matters is
that the choices are consistent with the use of the functions in the transformation
catalogue.

§3.3 Transformations

A transformation is made up of three components: a pattern, a replacement,
and a set of conditions. The pattern (an ML program fragment) and the conditions
(a set of predicates) establish the context of the transformation by identifying
those IL program fragments on which the transformation can operate. A contiguous
group of statements within the pattern is designated as the target - these
statements must be contiguous as they will be replaced in their entirety by the
new IL program fragment constructed from the replacement once the context has
been verified†.

The following criteria must be met before a transformation can be applied:

(1) all components of the pattern must match some component in the IL
    program fragment (and vice versa). Duplicated wild cards must have
    matched IL components with the same representation.

(2) each of the conditions must evaluate to true. If any condition aborts
    (see §3.2.2), the application of the transformation fails. Note that
    conditions may use named wild cards from the pattern as part of an
    argument; these wild cards will be replaced by the component(s)
    they matched during (1) before evaluation of the function.

(3) the target must be a contiguous group of statements from the
    matched IL program fragment.

(4) the replacement must be successfully constructed - each in-line
    built-in function call must be evaluated without aborting.

If all these criteria are met, the newly constructed replacement is substituted for
the target, completing the application of the transformation.

The following section describes the syntax of a transformation in more
detail; §3.3.2 outlines how the replacement is constructed.
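The four criteria can be read as a short driver loop. The sketch below assumes a transformation is supplied as three Python callables (a matcher, a list of conditions, and a replacement builder) and that an IL statement is a `(label, operator, operands)` tuple; the multiply-to-shift rule and the operator name `shift_left` are hypothetical examples, not entries from the catalogue.

```python
class Abort(Exception):
    """A built-in function was given an argument outside its domain."""


def apply_transformation(fragment, match, conditions, build):
    """Return the rewritten fragment, or None if the transformation fails."""
    bindings = match(fragment)                  # criteria (1) and (3)
    if bindings is None:
        return None
    try:
        if not all(cond(bindings) for cond in conditions):   # criterion (2)
            return None
        return build(bindings)                  # criterion (4)
    except Abort:
        return None                             # any abort fails the application


def numeric(component):
    if not (component.startswith('"') and component.endswith('"')):
        raise Abort(component)
    try:
        return int(component[1:-1])
    except ValueError:
        raise Abort(component) from None


def match_multiply(stmt):
    label, operator, operands = stmt
    if operator == "multiply" and len(operands) == 2:
        return {"?dest": label, "?a": operands[0], "?b": operands[1]}
    return None


def b_is_power_of_two(b):
    n = numeric(b["?b"])
    return n > 0 and n & (n - 1) == 0


def build_shift(b):
    shift = numeric(b["?b"]).bit_length() - 1
    return (b["?dest"], "shift_left", [b["?a"], f'"{shift}"'])


def rewrite(stmt):
    return apply_transformation(stmt, match_multiply, [b_is_power_of_two], build_shift)


print(rewrite(("T1", "multiply", ["<X>", '"8"'])))
```

Treating an abort in a condition or in the replacement exactly like a failed match keeps the driver simple: the original fragment is left untouched whenever `None` is returned.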



† Statement sequences matched by $* wild cards cannot, in general, be used in a
target since they do not necessarily contain lexically adjacent statements. For
similar reasons, $* wild cards are seldom used in the specification of a
replacement.

§3.3.1 The syntax of a transformation

A transformation has the following form:

    Label   Operator   Operands   Attributes
    ...pattern goes here...
    ...replacement goes here...
    ...conditions go here...

The first section contains the ML program fragment which serves as the pattern,
the second section contains the replacement (also an ML program fragment), and
the final section contains a set of conditions (if no conditions are needed, the final
section may be omitted). Target statements within the pattern are indicated by a
double vertical bar to their left. For example:



       Label   Operator   Operands        Attributes
    || →       beq        ?dest1 ?next    location=?loc
    || •       label      ?next
    || →       jmp        ?dest2
       •       label      ?dest1
       [...]
       •       label      ?dest2          [...]

       →       bne        ?dest2          location=?loc

       conditions: [illegible in source]



In this transformation the first three statements of the pattern are the target and
will be replaced by the single statement replacement when the transformation is
applied. The remaining statements matched by the pattern (two labels and the
intervening statements) will be unchanged. The intent of the transformation is to
use the short address form of the jump in place of the first
three statements if the ultimate destination (?dest2) is not too far away (less than
256 bytes). This transformation only handles forward jumps - another
transformation would be needed to accommodate jumps in the other direction.
Other points to note: the use of duplicate wild cards to specify that the same IL
component must appear in more than one place; the first and last statement of the
matched fragment must have location attributes.

With one exception, each component of the matched IL program fragment
must be subsumed by some component of the pattern. The contents of the
attribute field are exempt from this condition - attributes in the IL fragment that
are not named in the pattern do not enter into the matching process. The use of a
?* wild card to capture the unspecified attributes for later replication in the
replacement is not necessary as there are special rules concerning them in
construction of the replacement (see §3.3.2). Thus, attributes are largely
transparent to a transformation; the information they contain is automatically copied
to the updated program wherever necessary. New attributes may be added to any
statement or cell by simply including the appropriate assignment in the replacement.
In the example above, a location attribute is defined for the new "bne" statement
with the same value as the location attribute for the original "beq" statement.

A new rvalue for a cell may be indicated to the metainterpreter by including
an assignment to the rvalue (similar to the definition of an attribute) in the attribute
field of the appropriate statement in the replacement. For example, the following
transformation replaces the addition of two constants with a store operation,
indicating that the destination of the store has acquired a new value which is the
sum of the constants.



    Label   Operator   Operands          Attributes
    ?dest   add        ?op1 ?op2

    ?dest   store      add[?op1,?op2]    <?dest>=add[?op1,?op2]

    conditions: and[number[?op1],number[?op2]]



The result from the call to add in the operand field of the replacement will be
automatically surrounded by quotes (to indicate that the new operand is a literal).
The number built-in function returns "true" if its argument is a numeric literal; the
condition could be omitted entirely as add aborts if its arguments are not numeric
literals, causing the transformation to fail. Note that the rules mentioned in the
previous paragraph will ensure that any attributes defined for ?dest in the original
statement will be added to the attribute field for the store statement. Finally, it is
worth pointing out that ?op1 and ?op2 do not need to be literals in the original
program - ?op1 and ?op2 need only be able to be resolved to literals when the
transformation is applied. For example, the statement "add <X> <Y>" would match
the pattern if <X> and <Y> were both known to have constant values. These
values would have been established in previous statements by including
assignments to <X> and <Y> in the attribute fields of those statements.
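The resolution of `<X>` and `<Y>` to known constants can be sketched as follows. The sketch assumes the metainterpreter keeps a table of known rvalues keyed by cell name and that a statement is a `(label, operator, operands, attributes)` tuple; `resolve`, `fold_add`, and the table layout are illustrative assumptions rather than the thesis's own design.

```python
def resolve(operand, rvalues):
    """Return the integer value of an operand, or None if it is not a known constant."""
    if operand.startswith("<") and operand.endswith(">"):
        operand = rvalues.get(operand[1:-1])     # look up the cell's recorded rvalue
        if operand is None:
            return None
    if len(operand) >= 2 and operand[0] == operand[-1] == '"':
        try:
            return int(operand[1:-1])
        except ValueError:
            return None
    return None


def fold_add(stmt, rvalues):
    """Rewrite an add of two resolvable constants into a store of their sum."""
    label, operator, operands, attributes = stmt
    if operator != "add" or len(operands) != 2:
        return None
    values = [resolve(op, rvalues) for op in operands]
    if None in values:
        return None
    total = f'"{sum(values)}"'
    rvalues[label] = total                       # the destination acquires a new rvalue
    return (label, "store", [total], attributes + [f"<{label}>={total}"])


known = {"X": '"2"', "Y": '"3"'}
folded = fold_add(("T2", "add", ["<X>", "<Y>"], []), known)
print(folded)
```

Because the new rvalue is recorded in the table as well as in the attribute field, a later statement that reads `<T2>` can be folded in the same way.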



§3.3.2 Constructing the replacement

Two capabilities are provided by the replacement that have not been
discussed previously: the generation of new symbols unused elsewhere in the
program and the automatic handling of attributes. The ability to generate an
unused symbol is necessary when the transformation expands a single statement
into a series of new statements, as temporary cells used by the new statements
need to be supplied names that are not used elsewhere in the program. Automatic
handling of attributes enables the designer to ignore attributes with which he is not
directly concerned and guarantees that no attribute information will be lost through
an oversight in composing the transformation.

When expanding the specification of the replacement to arrive at the new
program fragment all wild cards must be eliminated. If the wild card has the same
form and name as one which appeared in the pattern, the IL component matched by
that wild card serves as its value in the replacement. For instance, applying the
last transformation in the previous section to

    Label   Operator   Operands   Attributes
    [IL add statement illegible in source]

would result in the replacement

    Label   Operator   Operands   Attributes
    [IL store statement illegible in source]



If a ? wild card in the replacement does not correspond to some wild card in the
pattern (i.e., its name is different from any used in the pattern), a new lvalue is
created to be used as its value. The new lvalue is guaranteed to be different from
any used in the remainder of the IL program. Note that the designer must include
any attributes to be associated with the new lvalue as part of the transformation.
If there are no wild cards in the pattern that correspond to $, ?*, and $* wild
cards in the replacement, the transformation is illegal and will never be applied.

As an example of generated lvalues consider the following transformation
concerned with the expansion of the subscript operator:

    Label   Operator    Operands                     Attributes
    ?ptr    subscript   ?array ?index

    ?t1     convert     ?index                       ?t1:class=temporary
                                                     <?t1>:type=integer
    ?t2     subtract    <?t1> ?array:lower_bound     ?t2:class=temporary
                                                     <?t2>:type=integer
    ?t3     multiply    <?t2> <?array>:size          ?t3:class=temporary
                                                     [...]
    ?ptr    add         <?t3> ?array                 [...]
                                                     <<?ptr>>:type=<?array>:type



The convert operator in the first line of the replacement will coerce the value of
the index to type "integer" (see §3.4 for a sample definition of convert). ?t1, ?t2,
and ?t3 are all new cells which will be named when this transformation is applied;
?ptr, ?array, and ?index will be taken from the subscript statement matched by the
pattern. Note that pertinent attributes for the new cells have been defined in the
transformation. The attribute defined in the last line of the replacement indicates
that the type of the value pointed to by ?ptr is the same as the type of an
element in the array being subscripted.
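The naming of new cells can be sketched in a few lines. The example assumes statements are tuples of strings, that wild cards have the form `?name`, and that fresh names are drawn from the series `T1`, `T2`, ...; the function `instantiate`, the naming scheme, and the simplified operand spellings are assumptions made for illustration.

```python
import itertools
import re

WILD = re.compile(r"\?[A-Za-z_]\w*")


def instantiate(replacement, bindings, used):
    """Substitute wild cards in `replacement`.

    Wild cards bound by the pattern take their matched value; any other
    ? wild card receives a new lvalue that is not in `used`, the set of
    names already present in the program.
    """
    bindings = dict(bindings)
    fresh = (f"T{n}" for n in itertools.count(1))

    def value(match):
        name = match.group(0)
        if name not in bindings:
            new = next(n for n in fresh if n not in used)
            used.add(new)
            bindings[name] = new
        return bindings[name]

    return [tuple(WILD.sub(value, field) for field in stmt) for stmt in replacement]


replacement = [
    ("?t1", "convert", "?index"),
    ("?t2", "subtract", "<?t1> ?array:lower_bound"),
    ("?t3", "multiply", "<?t2> ?array:size"),
    ("?ptr", "add", "<?t3> ?array"),
]
expanded = instantiate(replacement,
                       {"?ptr": "P", "?array": "A", "?index": "I"},
                       used={"P", "A", "I", "T1"})
for stmt in expanded:
    print(stmt)
```

Since `T1` is already in use in this example, the generated cells are named `T2`, `T3`, and `T4`; each repeated occurrence of a new wild card reuses the name chosen at its first occurrence.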

The following rules are used in establishing attributes for statements in the
replacement:

1. Every attribute definition in the target statements will be copied to
   the attribute field of some statement in the replacement by the
   metainterpreter when it applies the transformation. Where possible,
   the statement chosen in the replacement will have the same
   label as the defining statement in the target - this does not make
   any difference as far as defining the attribute is concerned, but it
   improves the documentation value of the definition. If there are no
   statements in the replacement (the target is being completely
   eliminated), some other statement in the [...] program is chosen
   to receive the definitions.

2. If applying a transformation would result in a conflicting attribute
   definition (i.e., two or more definitions of the same attribute with
   different values), the transformation cannot be applied.

3. Statement attributes are never copied to the replacement; only cell
   attributes are updated.

Rule 2 ensures that once defined, attributes can be counted on to maintain their
original value (i.e., attribute definitions are conserved).
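Rules 1 and 2 amount to a merge with conflict detection. A minimal sketch, assuming attribute definitions are held in a dictionary keyed by the attribute reference (for example `A:offset`); the `Conflict` exception and `merge_attributes` are hypothetical names.

```python
class Conflict(Exception):
    """Two definitions of the same attribute disagree (rule 2)."""


def merge_attributes(existing, new):
    """Copy `new` definitions into `existing`, refusing conflicting values."""
    merged = dict(existing)
    for name, value in new.items():
        if name in merged and merged[name] != value:
            raise Conflict(name)
        merged[name] = value
    return merged


target = {"A:type": "automatic"}
merged = merge_attributes(target, {"A:offset": "0", "A:type": "automatic"})
print(merged)
```

A caller following rule 2 would treat `Conflict` as a reason not to apply the transformation, leaving the original definitions intact.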

§3.4 Example transformations

The first example is a transformation which expands the coercion operator
used in the sample expansion of subscript in the previous section. The convert
operator coerces its argument to have the type of the destination cell; it assumes that
types are constrained to be one of "integer" [...]

[...] For example, if ?arg:type=integer and ?result:type=real then the
replacement can be reduced to a single statement by eliminating [...] compile time
evaluation of the if_goto statements. Although the transformation is lengthy due to
the lack of any support in ML for branching on the value of attributes, it was
straightforward to construct. Note that this transformation cannot be applied if
either ?arg:type or ?result:type is undefined (equal will abort). Through the use of
conditions, it would be possible to rewrite the single transformation above as three
separate transformations, one for each of the cases treated; the amount of
[...] would be considerably reduced.

The following series of transformations deals with the expansion of the store
operator. Unlike the transformation above, these expansions must be done in
separate transformations because of the use of the ALIAS operator†. The first
transformation handles the case where the store operation can be eliminated
completely because the destination is a newly defined temporary and the value
being stored is already contained in an accessible cell. In this case, all that needs
to be done is alias the temporary to the cell already containing the value
(effectively renaming all occurrences of the temporary to use the cell name).

† The ALIAS operator, like attributes, provides [...] which is independent of
the flow of control; branches cannot prevent [...] of the alias operation.
Thus, the strategy used for expanding the [...] be used.

    conditions: and[equal[<?dest>:type,<?source>:type],lvalue[?source]]



The next two transformations translate the store instruction to the appropriate
machine instruction, depending on the type of the destination.



These two transformations "overlap" the first - program fragments matched by the
first transformation will also be matched by one of the other two transformations.
It is up to the metainterpreter to decide which of the applicable transformations to
apply; presumably the first transformation will be used whenever possible because
of the reduced cost of the resulting code. The [...] accommodates
store statements whose source and destination have different types.
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14.1 
addition and assignment; the order of expression evaluation is constrained to be
left-to-right (no reordering is allowed); all quantities are 16-bit two's complement
integers (the same for both the source and machine language). In examining the
assembly language program, it is apparent that certain conventions have been used
in the translation: r6 is used as the local stack frame pointer, external variables
are referenced by name, local (automatic) storage for blocks is allocated from the
stack and referenced using the local stack frame pointer, and so on. These
conventions are established originally by the designer and implemented by
transformations in a straightforward fashion.

Although it is possible to interpretively apply the transformations and derive
a translation, the reader should be reminded that the main goal of the
transformations is to be descriptive. Many of the transformations below employ
attributes and conditions that represent a reasonable description of the information
and constraints involved in a transformation; these transformations are not the
most elegant expression of the necessary syntactic transformation. In the final
analysis, a transformation should be judged on the information it conveys and not
how close it comes to "the way it should really be done."

The approach adopted for the organization of the transformations is as
follows: the initial IL program is first translated into instructions for a stack
architecture, then the updated program is translated into target machine
instructions. Optimizations exist for each level of intermediate program - sample
high-level optimizations are described in §[...], stack machine optimizations in §4.1, and
peephole machine optimizations in §4.2.

The first group of transformations describes the process of storage
allocation. An "offset" attribute is introduced for each automatic variable declared
in the block, giving the variable's offset from the base of the local stack frame; the
highest offset used is used in calculating the storage to be allocated for the
block when it is entered.

The two transformations above handle declaration processing: automatic variables
are assigned offsets, external variables are declared global. In the first
transformation, offsets are propagated with the aid of a comment statement that
gives the current offset. The $*stat wild card will match only statement
sequences that do not contain an "offset" attribute definition or "declaration"
operator in any statement (this restriction is embodied in the condition). Note that
attributes defined for the declared variables will be automatically copied over to
some replacement statement (in these cases, there is only one).

    Label   Operator   Operands        Attributes
    [...]   assign     "1"
    [...]   assign     "2"
    [...]   plus       <A> <B>         T1:type=temporary <T1>:type=integer
    T2      plus       <T1> [...]      T2:type=temporary <T2>:type=integer
    [...]   assign     <T2>
            exit       PROG:storage
            comment                    PROG:storage=4

Figure 4.1: Sample program after declaration transformations
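The bookkeeping performed by the declaration transformations can be summarised procedurally. This sketch assumes each declaration is a `(name, storage_class, size_in_bytes)` triple and that offsets are assigned in declaration order starting at zero; the function `allocate` and the `global` attribute name are assumptions, and the sketch does not model the comment statement that carries the running offset.

```python
def allocate(declarations):
    """Assign frame offsets to automatic variables.

    Returns the attribute definitions produced and the total storage
    (in bytes) that the block needs on entry.
    """
    offset = 0
    attributes = {}
    for name, storage_class, size in declarations:
        if storage_class == "automatic":
            attributes[f"{name}:offset"] = offset   # offset from the frame base
            offset += size
        else:
            attributes[f"{name}:global"] = True     # external: addressed by name
    return attributes, offset


attrs, storage = allocate([
    ("A", "automatic", 2),
    ("B", "automatic", 2),
    ("C", "external", 2),
])
print(attrs, storage)
```

With two 2-byte automatic variables the block needs 4 bytes of storage; external variables contribute nothing to the frame.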
    [transformation only partly legible in source; the recoverable fragments are
    the statement "end ?name", the attribute "offset=?off", and the condition below]

    conditions: not[operator["declaration",$*stat]]

This transformation handles block exit after all declarations have been processed,
deallocating storage for the block and defining the storage size attribute
(?name:storage) for use during block entry. The condition is similar to that for
automatic variable declarations. Figure 4.1 shows the IL program after these
transformations.
The following two transformations perform simple optimizations on the stack
machine code generated so far. Both transformations improve on pop/push
instruction pairs that have identical operands: the first transformation eliminates
pairs whose arguments are temporaries; the second transformation converts pairs
whose arguments are variables to a copy from the top of the stack. Since
temporaries were generated by the compiler and do not represent user-visible
quantities, they may be eliminated during optimization. Figure 4.2 shows the
example IL program after translation to stack instructions.
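The effect of the two pop/push optimizations can be sketched as a single pass over the stack code. The sketch assumes instructions are tuples such as `("push", "A")` or `("add",)` and that the set of compiler-generated temporaries is known; it also assumes a temporary is not referenced again after the pair, which the sketch does not check.

```python
def optimize_pop_push(code, temporaries):
    """Improve adjacent pop/push pairs that name the same operand."""
    out = []
    i = 0
    while i < len(code):
        pair = (i + 1 < len(code)
                and code[i][0] == "pop" and code[i + 1][0] == "push"
                and code[i][1] == code[i + 1][1])
        if pair:
            operand = code[i][1]
            if operand not in temporaries:
                # A variable must still receive the value: copy from top of stack.
                out.append(("copy", operand))
            # For a temporary the pair is dropped; the value stays on the stack.
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out


code = [
    ("push", "A"), ("push", "B"), ("add",),
    ("pop", "T1"), ("push", "T1"),
    ("push", "C"), ("add",),
    ("pop", "X"), ("push", "X"),
]
optimized = optimize_pop_push(code, temporaries={"T1"})
print(optimized)
```

The pair on `T1` disappears entirely, while the pair on the user variable `X` becomes a single copy that leaves the value on the stack.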
§4.2 Compiling past the machine interface

In this section, we deal with translating stack machine programs to target
machine programs. The first set of transformations are a straightforward translation
of "push", "pop", "copy", and "add" to machine instructions. The size in bytes
and number of storage references required for each machine instruction are
indicated by the "size" and "refs" attributes respectively.
Initial values for the "size" and "refs" attributes do not take operands into account
- the operands' contributions will be included when they are translated to legal
assembly language constructs.

The next group of transformations translates individual operands into the
appropriate machine addresses. Recall that r6 is used as the base of frame
pointer and that external operands are addressed by name.



    Label   Operator   Operands                             Attributes
            ?rator     ?*before <?rand> ?*after             size=?size refs=?refs

            ?rator     ?*before ?rand:offset(r6) ?*after    [...]

    conditions: equal[?rand:type,"automatic"]


    Label   Operator   Operands                             Attributes
            ?rator     ?*before <?rand> ?*after             size=?size refs=?refs

            ?rator     ?*before ?rand ?*after               size=add[?size,"2"]
                                                            refs=add[?refs,"2"]

    conditions: equal[?rand:type,"external"]


    Label   Operator   Operands                             Attributes
    [pattern and replacement illegible in source]

    conditions: constant[?rand]



?*before and ?*after are ambiguous wild cards used to select any component in
the operand field that has the correct form (specified by the remaining component
in the pattern's operand field). Note that the specification of "size" and "refs"
attributes in the patterns ensures that the transformations will only be applied to
machine instructions. Figure 4.3 shows the IL program after application of these
transformations (unused attributes have been eliminated for brevity).

The most obvious optimization opportunity involves a push onto the stack (a
"mov" instruction with a second argument of "-(sp)") followed by an instruction that
pops the stack to get its source operand (an instruction with a first argument of
"(sp)+"). Since an "add" can take the same source operands as a "mov"
instruction, the push/pop sequences can be reduced to a single instruction:
Figure 4.4 shows the effect of this single optimization.
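The push/pop reduction described above can be illustrated with a short peephole pass. The sketch assumes two-operand instructions represented as `(operator, source, destination)` tuples with textual addressing modes; the function `fuse_push_pop` and the restriction to "mov" and "add" consumers are assumptions made for this example.

```python
def fuse_push_pop(code):
    """Merge `mov X,-(sp)` followed by `op (sp)+,Y` into `op X,Y`."""
    out = []
    for op, src, dst in code:
        if out and op in ("mov", "add") and src == "(sp)+":
            prev_op, prev_src, prev_dst = out[-1]
            if prev_op == "mov" and prev_dst == "-(sp)":
                # The pushed value is consumed immediately: use it directly.
                out[-1] = (op, prev_src, dst)
                continue
        out.append((op, src, dst))
    return out


code = [
    ("mov", "0(r6)", "-(sp)"),
    ("mov", "2(r6)", "-(sp)"),
    ("add", "(sp)+", "(sp)"),
    ("mov", "(sp)+", "4(r6)"),
]
fused = fuse_push_pop(code)
for instruction in fused:
    print(instruction)
```

Only a push that is immediately followed by its consuming pop is merged, so the final `mov (sp)+,4(r6)` in this example is left alone because the instruction before it is no longer a push.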

Many other machine level optimizations are possible at this point; several
optimizing transformations are listed below. These include removing superfluous
zeroes in index expressions, eliminating additions with a zero operand, and
Figure 4.4: Sample program after push/pop optimization

Figure 4.6 shows the IL program after application of these final transformations -
comment and attributes have been omitted and attribute references resolved.
Obviously, additional transformations would be needed to handle optimization
opportunities that arise from the translation of other [...]; however, the bulk of
the translation can be accomplished with [...]

§4.3 Interacting with the metainterpreter

The transformations in the previous section dealt with the translation of the
input program to a target machine program with little attention to the semantics of
the initial IL program. For the most part, the metainterpreter had only to choose
which transformations to apply. This task was made fairly simple because, in almost
every case, if the transformation's pattern and conditions were met, it was
appropriate to apply the transformation. This section explores how the capabilities
of the metainterpreter can be called into play to improve the quality of the
resulting translation.

The first example exploits the metainterpreter's ability to perform certain
computations at compile time. Consider the addition of the following transformations
to the catalogue:
"assign" and "plus" operators. Using the definition of "add" given in §3.2.2, the
second transformation will only succeed if ?op1 and ?op2 are numeric literals. By
extending the metainterpreter to support symbolic computation, both the
transformations above would be useful even for non-literal operands (although the
[...] should not eliminate the explicit plus operation unless the [...]
at compile time). The primary benefit of such an extension
would be a corresponding extension in the metainterpreter's ability to detect
redundant computations.
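
The restriction to numeric literals can be sketched as follows. This is a hypothetical Python rendering, not the transformations themselves: the statement layout `(operator, operands)`, the operand order of "plus", and the helper name `fold_plus` are all assumed; only the idea that "add" fails on non-literal operands comes from the text.

```python
from typing import Optional, Tuple

Stmt = Tuple[str, Tuple[str, ...]]  # (operator, operands); assumed layout

def add(x: str, y: str) -> Optional[str]:
    """Sum two numeric literals; None models an aborted function application."""
    try:
        return str(int(x) + int(y))
    except ValueError:
        return None

def fold_plus(stmt: Stmt) -> Stmt:
    """Rewrite ('plus', (dest, op1, op2)) as an assignment when both are literals."""
    op, operands = stmt
    if op != "plus" or len(operands) != 3:
        return stmt
    dest, op1, op2 = operands
    total = add(op1, op2)
    if total is None:
        # Non-literal operand: the transformation does not apply.
        return stmt
    return ("assign", (dest, total))

folded = fold_plus(("plus", ("T1", "1", "2")))
unchanged = fold_plus(("plus", ("T1", "<A>", "2")))
```

The second call leaves the statement untouched because `<A>` is not a numeric literal; once the rvalue of A is known and substituted, the same call would succeed.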

Applying these transformations to the sample program in the first section,
the metainterpreter can acquire the following rvalue information:

<A>=1, <B>=2, <T1>=3.
As a result of this new information, the initial program can be modified as shown in
Figure 4.6 (update of Figure 4.2).
                    offset=0
comment             A:type=automatic A:offset=0
                    <A>:type=integer <A>:size=2
                    offset=2
global    B
comment             B:type=external
                    <B>:type=integer <B>:size=2
                    C:type=automatic C:offset=2
                    <C>:type=integer <C>:size=2
                    offset=4

          A  B  T1  T2  C
assign
assign
comment             T1:type=temporary <T1>:type=integer
                    T2:type=temporary <T2>:type=integer
                    PROG:storage=#4

Figure 4.6: Sample program after declaration transformations
                                 size=2 refs=1
sub       PROG:storage sp        size=4 refs=2
global    B
mov       #1 (r5)                size=4 refs=3
mov       #2 B                   size=6 refs=4
mov       #3 2(r5)               size=6 refs=4
add       PROG:storage sp        size=4 refs=2
comment                          PROG:storage=#4

Figure 4.7: Sample program after optimizations of §4.3

to subsequently unused temporaries, the transformations of §4.2 can produce a
program identical to the assembly language program given in §4.1 (see Figure 4.7).

allows it to be referenced by the transformations, permitting the translation of
statements to be tailored in response to special properties of the operands or
opportunities presented by the context.

In Chapter 3, the transformation catalogue is discussed and the
metalanguage in which the individual transformations are written is presented. The
metalanguage provides the ability to describe classes of IL program fragments,
leaving statements and components unspecified by means of wild cards.

Each transformation contains two IL program fragments (templates): a pattern that,
along with a set of conditions, specifies the IL program fragments to which the
transformation can be applied, and a replacement that tells how to construct an
updated IL program. Built-in functions that allow access to some of the
metainterpreter's knowledge about IL programs and perform some simple
computations on literals are provided; these functions are used in constructing
the replacement and conditions. The conditions associated with a transformation
specify contextual constraints that are not related to the syntactic form of the
matched fragment. The wide range of information available to a transformation
enables the semantics of code generation to be expressed as step-by-step
syntactic transformations of the intermediate language program.
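
The three parts of a transformation can be pictured with a minimal Python sketch. The class `Transformation`, the helper `match`, the flat-tuple statement layout, and the example rule are all hypothetical; only the division into pattern, conditions, and replacement, and the name of the built-in function "constant", are taken from the text (the behaviour given to `constant` here is assumed).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Stmt = Tuple[str, ...]        # (operator, operand, ...); assumed layout
Bindings = Dict[str, str]

def match(pattern: Stmt, stmt: Stmt) -> Optional[Bindings]:
    """Match one statement; pattern components starting with '?' are wild cards."""
    if len(pattern) != len(stmt):
        return None
    env: Bindings = {}
    for p, s in zip(pattern, stmt):
        if p.startswith("?"):
            if env.setdefault(p, s) != s:  # a repeated wild card must agree
                return None
        elif p != s:
            return None
    return env

def constant(text: str) -> bool:
    """Assumed behaviour of the built-in: true for integer literals."""
    try:
        int(text)
    except ValueError:
        return False
    return True

@dataclass
class Transformation:
    pattern: Stmt
    conditions: List[Callable[[Bindings], bool]]
    replacement: Callable[[Bindings], List[Stmt]]

    def apply(self, stmt: Stmt) -> Optional[List[Stmt]]:
        env = match(self.pattern, stmt)
        if env is None or not all(cond(env) for cond in self.conditions):
            return None
        return self.replacement(env)

# Hypothetical rule: assigning a literal becomes a move of an immediate operand.
assign_literal = Transformation(
    pattern=("assign", "?dst", "?src"),
    conditions=[lambda env: constant(env["?src"])],
    replacement=lambda env: [("mov", "#" + env["?src"], env["?dst"])],
)

result = assign_literal.apply(("assign", "A", "1"))
```

Keeping the conditions separate from the pattern mirrors the distinction drawn above: the pattern fixes the syntactic form, while the conditions consult other knowledge about the matched operands.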

Chapter 4 presents a set of example transformations as a specification for
translating a rudimentary source language to PDP11-like assembly language. As
suggested in §1.3.1, the transformations are organized about the use of an
abstract machine (in this case, with a stack architecture). The initial translation to
stack machine instructions allows several optimizations to be accomplished that
would have otherwise been difficult (e.g., the removal of unnecessary temporaries
inserted by the first phase of the compiler). Several transformations that allow the
metainterpreter to infer the run time values of the variables and subsequently
• translation of attribute references to their corresponding values
  wherever possible. If any unresolved attribute references remain
  after completion of the translation, the metainterpreter
  should abort, indicating an inconsistent IL program.

• evaluation of built-in functions. If a function application aborts (e.g.,
  because of domain errors), it is saved for reevaluation later in the
  translation.

• propagation of rvalue information. In combination with data from flow
  analysis, it is possible to replace rvalue operands with literals
  representing the known value of the rvalue.

• application of a chosen transformation. Information obtained during
  the match of the pattern is incorporated in the replacement
  specification (along with any generated symbols) to create a
  replacement for those statements in the pattern. During the
  construction of the replacement, many of the other bookkeeping
  functions can be performed then and there, eliminating the need for
  extra passes over the IL program.
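
The first of these bookkeeping tasks can be sketched in a few lines of Python. The representation is assumed for illustration: a program is a list of `(operator, operands)` pairs, an attribute reference is any operand of the form `name:attribute`, and the names `resolve_references`, `check_resolved`, and `InconsistentProgram` are hypothetical.

```python
from typing import Dict, List, Tuple

Stmt = Tuple[str, List[str]]  # (operator, operands); assumed layout

class InconsistentProgram(Exception):
    """Raised when attribute references survive to the end of translation."""

def resolve_references(program: List[Stmt], attributes: Dict[str, str]) -> List[Stmt]:
    """Replace every attribute reference whose value is known; leave the rest."""
    return [(op, [attributes.get(x, x) for x in operands]) for op, operands in program]

def check_resolved(program: List[Stmt]) -> None:
    """Abort if any operand still looks like an attribute reference."""
    for op, operands in program:
        for x in operands:
            if ":" in x:
                raise InconsistentProgram(f"unresolved reference {x} in {op}")

program = [("sub", ["PROG:storage", "sp"]), ("add", ["PROG:storage", "sp"])]
resolved = resolve_references(program, {"PROG:storage": "#4"})
check_resolved(resolved)  # passes: every reference has been replaced
```

Resolution is attempted wherever possible and is harmless when a value is not yet known; only the final check treats a leftover reference as an error.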

Two other tasks fall in this area: checking for termination conditions and choosing
which transformation to apply next.

§1.3.2 outlines how to tell when the translation is complete: a measure of
the program's optimality is computed using a formula (in this case, involving the
values of attributes associated with every statement) supplied by the user. If the
calculation aborts because some statement does not have the appropriate
attributes, the application of more transformations is called for; if no more
transformations are applicable, backtracking is called for. If the measure can be
computed, it is used to remember the best translation found to date and the
metainterpreter backtracks to find other translations. Backtracking involves
undoing the last successful transformation and applying a different one
(repeating for another level if all the applicable transformations have been applied
at this level). Exhaustive search of the transformation tree can be avoided if the
user supplies a "trigger" value for the measure: any program whose measure is
less than the trigger value is considered an acceptable translation and becomes
the final output. When the transformations are constructed in such a way that the
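
The termination and backtracking procedure just described can be summarized in a short Python sketch. It is a schematic depth-first search under stated assumptions, not the metainterpreter: `successors` stands for "all programs reachable by one applicable transformation", `measure` returns None when the calculation aborts, smaller measures are taken to be better, and the toy cost table is invented for the example.

```python
from typing import Callable, Iterable, Optional, Tuple

Program = Tuple[str, ...]

def search(
    program: Program,
    successors: Callable[[Program], Iterable[Program]],
    measure: Callable[[Program], Optional[int]],
    trigger: Optional[int] = None,
) -> Optional[Tuple[int, Program]]:
    """Return (measure, program) for the best translation found, or None."""
    best: Optional[Tuple[int, Program]] = None

    def visit(p: Program) -> bool:
        nonlocal best
        m = measure(p)
        if m is not None:
            # Measurable program: remember it if it is the best so far,
            # then backtrack (or stop if it beats the trigger value).
            if best is None or m < best[0]:
                best = (m, p)
            return trigger is not None and m < trigger
        for q in successors(p):       # apply one more transformation
            if visit(q):
                return True
        return False                  # nothing applicable: backtrack

    visit(program)
    return best

# Toy example: "X" is an untranslated statement with two possible translations.
COST = {"a": 3, "b": 1}

def measure(p: Program) -> Optional[int]:
    if any(s not in COST for s in p):
        return None                   # some statement lacks the needed attributes
    return sum(COST[s] for s in p)

def successors(p: Program) -> Iterable[Program]:
    for i, s in enumerate(p):
        if s == "X":
            for r in ("a", "b"):
                yield p[:i] + (r,) + p[i + 1:]

exhaustive = search(("X", "X"), successors, measure)
triggered = search(("X", "X"), successors, measure, trigger=5)
```

Because programs are immutable tuples, "undoing" a transformation amounts to returning from the recursive call. With a trigger value the search stops at the first acceptable translation, which need not be the best one.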
possible when rvalue information is considered.
There are other related optimizations requiring the same dataflow information.

The required flow analysis could be done anew at the completion of every
transformation application, but this would be incredibly inefficient and prohibitive for
large programs. The bit vector methods outlined by [Scbatz] and [Ullman] offer an
efficient representation of the data flow information that can be incrementally
updated as long as the underlying flow graph is not changed (except to add/delete
more straight-line code or loops completely contained in the added code). Thus,
the more time-consuming iterative calculation required when the flow graph is not
known need only be performed when a transformation affects the branches and
joins of the graph. A large percentage of transformations do not affect the graph
itself; all of the transformations in Chapter 4 could be accommodated by
incremental analysis.
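
To give a feel for the bit vector representation, the sketch below runs a standard iterative live-variable analysis with Python integers as bit vectors. It is a generic textbook formulation chosen for illustration, not the specific method of the cited references; the block numbering, the `use`/`define` tables, and the function name `liveness` are assumptions.

```python
from typing import Dict, List

def liveness(
    succ: Dict[int, List[int]],
    use: Dict[int, int],
    define: Dict[int, int],
) -> Dict[int, int]:
    """Return the live-in bit vector of each block, iterating to a fixed point."""
    live_in = {b: 0 for b in succ}
    changed = True
    while changed:
        changed = False
        for b in succ:
            live_out = 0
            for s in succ[b]:
                live_out |= live_in[s]          # union over successors
            new_in = use[b] | (live_out & ~define[b])
            if new_in != live_in[b]:
                live_in[b] = new_in
                changed = True
    return live_in

# One bit per variable.
A, B, C = 1, 2, 4

# Block 0 defines A; block 1 (a loop) computes C from A and B; block 2 uses C.
succ = {0: [1], 1: [1, 2], 2: []}
use = {0: 0, 1: A | B, 2: C}
define = {0: A, 1: C, 2: 0}

live = liveness(succ, use, define)
```

A transformation that rewrites straight-line code inside a block changes only that block's `use` and `define` vectors and leaves `succ` alone, which is the situation in which an incremental update suffices.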

In a different vein, code motion out of loops, elimination of induction
variables, etc. (see [Aho77b] for a large sample) represent other optimizations that
could be incorporated in the metainterpreter. As algorithms are developed for
register allocation and optimal ordering of expression execution, these will also be
prime candidates for inclusion. Our shopping list can easily grow much faster than
our ability to implement the algorithms effectively within the framework provided by
the metainterpreter. Fortunately, some transformations are much more important
than others; the list given under flow analysis is a good start towards an excellent
code generator.

§6.3 Directions for future research

Two avenues of research are natural extensions of the work reported here.
The examples of Chapters 3 and 4 indicate that much improvement could be made
to the usability of the metalanguage. Many operations commonly performed during
program could lead to a very competent compiler that is easily maintained and
modified to produce code for different target machines.

Many other implementation approaches lie further off the beaten path. One
of the most interesting is the prospect of creating a "compiled" code generator
based on an analysis of the specification. Such compilation would require
extensive information on the interaction between components of the specification;
the metacompiler would have to "understand" the effect of each transformation in a
much more fundamental way than is needed for [...]

Compiling the specification would eliminate much of the searching and backtracking
described in the beginning of §6.2, with the result of a vast improvement in the
performance of the code generator. This metacompilation phase will almost
certainly be necessary if the performance of our code generator is to approach
that of conventional ad hoc code generators.

Metacompilation is closely related to current work in the field of automatic
program synthesis. The specification [...] seems to have many of
the same characteristics as descriptions used in these synthesis systems [Green]:
a pattern-based transformation system is used as the knowledge base by both
systems. This commonality promises to allow many of the same techniques to be
used in the analysis of the specification. This area of research is still virgin
territory with the same promises of success and failure offered by any frontier.
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